Sequence 03 - CV Project Deep Dives and Follow-Up Question Preparation

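The interviewer will most likely ask you to walk through your projects. Present each project in terms of system capabilities, not business features.

Common template:

What is the background?
What is the system architecture?
What were you responsible for?
What was the key bottleneck?
How did you locate it?
What design or optimization did you do?
How did you validate the result?
What pitfalls did you hit?
How does it relate to the NVIDIA JD?

1. Project Overview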

flowchart TB
    CV[CV Projects] --> P1[AI Risk-Control and LLM Strategy Platform]
    CV --> P2[Rust + Vulkan Rendering Pipeline]
    CV --> P3[Diamond Renderer / WebGPU Compute]
    CV --> P4[High-Concurrency Communication Systems]
    CV --> P5[Advantest Semiconductor ATE Software]
    CV --> P6[NRI Quant / Compute / OpenCV Pipelines]

    P1 --> JD1[AI inference / serving / observability]
    P2 --> JD2[GPU profiling / memory / workload behavior]
    P3 --> JD3[GPU compute / numerical behavior]
    P4 --> JD4[networking / backpressure / tail latency]
    P5 --> JD5[HW/SW joint debugging / diagnostics]
    P6 --> JD6[C++/Python compute pipeline / correctness]

2. Project 1: AI Risk-Control and LLM Strategy Platform

2.1 How to Present It

This is the project closest to the JD. Do not just say "I built AI risk control"; present it as an inference/runtime/system project:

This project built an AI risk-control and LLM strategy platform that connected feature generation, online inference, retrieval-assisted analysis, anomaly detection, and strategy execution. My focus was not only model integration, but also the production inference path, observability, replayability, correctness validation, and performance-sensitive workflow design.

2.2 System Diagram

flowchart LR
    Event[Production Events] --> Feature[Feature Generation]
    Feature --> Queue[Queue / Stream]
    Queue --> Infer[Online Inference]
    Infer --> LLM[LLM / Retrieval / Agent]
    LLM --> Strategy[Strategy Engine]
    Strategy --> Action[Decision / Risk Action]
    Action --> Feedback[Effectiveness Feedback]
    Feedback --> Eval[Evaluation / Replay]
    Eval --> Feature

    Infer --> Obs[Metrics / Traces / Logs]
    LLM --> Obs
    Strategy --> Obs

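2.3 Likely Deep-Dive Questions

| Follow-up question | What to cover |
| --- | --- |
| How is the online inference path designed? | Request entry, feature generation, model invocation, strategy execution, feedback loop. |
| How do you control latency? | Caching, batching, asynchronous execution, rate limiting, graceful degradation, critical-path profiling. |
| How do you validate correctness? | Replay, shadow traffic, A/B tests, metric alignment, replay of anomalous samples. |
| How does the LLM part go to production? | Retrieval, agent workflow, output validation, guardrails on the strategy side. |
| How does this relate to NVIDIA inference? | The same concerns apply: serving path, latency/throughput, KV/cache, observability. |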
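
The replay/shadow answer in the table can be made concrete with a small comparison harness. The following is a minimal sketch, not the project's actual code: the event fields, the two scoring functions, and the tolerance are all hypothetical and exist only to show the shape of the check (run recorded events through both paths and collect the samples where they disagree).

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Event = Dict[str, float]


@dataclass
class Mismatch:
    index: int        # position of the event in the recorded batch
    baseline: float   # score from the current production path
    candidate: float  # score from the shadow (new) path


def shadow_compare(
    events: List[Event],
    baseline: Callable[[Event], float],
    candidate: Callable[[Event], float],
    tol: float = 1e-6,
) -> List[Mismatch]:
    """Replay recorded events through both paths and report disagreements."""
    mismatches = []
    for i, event in enumerate(events):
        b = baseline(event)
        c = candidate(event)
        if abs(b - c) > tol:
            mismatches.append(Mismatch(i, b, c))
    return mismatches


# Hypothetical scoring functions, for illustration only.
def baseline_score(e: Event) -> float:
    return 0.5 * e["amount"] + 0.1 * e["velocity"]


def candidate_score(e: Event) -> float:
    # The candidate path caps the score, so large events diverge.
    return min(0.5 * e["amount"] + 0.1 * e["velocity"], 100.0)


recorded = [
    {"amount": 10.0, "velocity": 2.0},
    {"amount": 400.0, "velocity": 5.0},
]
diffs = shadow_compare(recorded, baseline_score, candidate_score)
```

In an interview, the useful point is that the mismatch list becomes the anomalous-sample set to replay and explain before the candidate path takes real traffic.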

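2.4 Reframing in JD Language

Upgrade the project vocabulary as follows:

Business AI risk control -> production inference path
Log monitoring -> observability and replayability
Strategy effectiveness -> end-to-end validation
Slow requests -> TTFT/P99/queueing/debug
LLM calls -> serving runtime and scheduling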
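
To talk about "slow requests" in TTFT/P99/queueing terms, it helps to show how a tail percentile is broken down per stage. The sketch below is illustrative only: it assumes each request record carries hypothetical `queue_ms` and `ttft_ms` fields, uses synthetic data, and applies the nearest-rank percentile definition (other definitions interpolate and give slightly different values).

```python
import math
from typing import Dict, List


def percentile(samples: List[float], p: float) -> float:
    """Nearest-rank percentile for 0 < p <= 100."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def stage_percentiles(requests: List[Dict[str, float]], p: float) -> Dict[str, float]:
    """Compute the p-th percentile independently for every stage."""
    stages = requests[0].keys()
    return {stage: percentile([r[stage] for r in requests], p) for stage in stages}


# Synthetic data: 2 of 100 requests wait a long time in the queue,
# while time-to-first-token after dequeue stays constant.
requests = [
    {"queue_ms": 50.0 if i >= 98 else 1.0, "ttft_ms": 20.0}
    for i in range(100)
]

p50 = stage_percentiles(requests, 50)
p99 = stage_percentiles(requests, 99)
```

Here the median looks healthy in both stages, while the P99 shows that the tail comes from queueing rather than from the model call, which is the kind of decomposition the reframing above points at.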

3. Project 2: Rust + Vulkan Rendering Pipeline

3.1 How to Present It

This is your main project for connecting to CUDA/GPU profiling.

I built a Rust + Vulkan rendering pipeline and used RenderDoc/Nsight to study GPU workload behavior, resource utilization, memory behavior, and performance-quality tradeoffs. Although it was not CUDA production work, the transferable skills are GPU timeline analysis, resource bottleneck isolation, memory/layout reasoning, and profiling-driven optimization.

3.2 System Diagram

flowchart TB
    Scene[Scene Data] --> CPU[CPU Scene Prep]
    CPU --> Upload[Buffer / Texture Upload]
    Upload --> GPU[GPU Pipeline]
    GPU --> Pass1[Geometry Pass]
    GPU --> Pass2[Lighting Pass]
    GPU --> Pass3[Post Process]
    GPU --> Present[Present]

    GPU --> Metrics[GPU Time / Resource Usage / Memory Behavior]
    Metrics --> Tools[RenderDoc / Nsight]
    Tools --> Optimization[Culling / LOD / Layout / Pass Optimization]

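3.3 Likely Deep-Dive Questions

| Follow-up question | Answer direction |
| --- | --- |
| What transfers from Vulkan to CUDA? | GPU execution, memory locality, async commands, profiling, resource bottlenecks. |
| How do you locate a GPU bottleneck? | Start with the timeline, then drill into the pass/kernel, and finally look at memory/resource/stall. |
| What do you look at in Nsight? | GPU timeline, draw/dispatch, copy, sync, GPU idle. |
| What is the gap relative to CUDA? | CUDA kernel metrics are more fine-grained; I still need to build up Nsight Compute, warp, and coalescing knowledge. |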
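
The "timeline first, then pass/kernel" order can be illustrated with a tiny triage helper. This is a sketch under stated assumptions: the per-pass timings are made-up numbers of the kind one might export from a profiler, the pass names are hypothetical, and passes are assumed not to overlap on the GPU (with async overlap, busy time is not a simple sum).

```python
from typing import Dict, List, Tuple


def rank_passes(
    pass_ms: Dict[str, float], frame_ms: float
) -> Tuple[List[Tuple[str, float, float]], float]:
    """Rank passes by GPU time and report their share of the frame.

    Returns (ranked, idle_ms), where ranked holds (name, ms, share_of_frame)
    sorted from most to least expensive, and idle_ms is frame time not
    covered by any pass (assuming passes do not overlap).
    """
    busy_ms = sum(pass_ms.values())
    idle_ms = max(0.0, frame_ms - busy_ms)
    ranked = sorted(pass_ms.items(), key=lambda kv: kv[1], reverse=True)
    shares = [(name, ms, ms / frame_ms) for name, ms in ranked]
    return shares, idle_ms


# Illustrative timings in milliseconds; not measurements from the project.
timings = {"geometry": 3.0, "lighting": 9.0, "post_process": 2.0}
shares, idle_ms = rank_passes(timings, frame_ms=16.0)
```

With these numbers the lighting pass is the first place to drill into, and the remaining idle time hints at a sync or CPU-side stall worth checking on the timeline before touching any shader.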

3.4 Do Not Overstate

Do not say:

I have CUDA kernel production ownership.

Say instead:

My production GPU-adjacent work was Vulkan/WebGPU rather than CUDA kernel ownership, but the profiling and bottleneck isolation mindset is transferable. I am closing the CUDA-specific gap with focused CUDA/Nsight experiments.

4. Project 3: Diamond Renderer / WebGPU Compute

4.1 How to Present It

This project is a good way to show that you can handle GPU compute, numerical behavior, and performance-quality tradeoffs.

flowchart LR
    Input[Geometry / Material] --> Shader[WebGPU Shader]
    Shader --> Physics[Refraction / Dispersion / TIR]
    Physics --> Render[Real-time Result]
    Render --> Tradeoff[Quality vs Performance]
    Tradeoff --> Profile[GPU Timing / Visual Validation]

Interview connection points:

| JD point | Project connection |
| --- | --- |
| GPU programming models | WebGPU shader/compute mental model |
| performance-quality tradeoff | real-time rendering constraints |
| profiling | GPU timing and workload analysis |
| numerical behavior | optics simulation correctness |
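
The refraction and total internal reflection (TIR) step in the diagram rests on Snell's law, n1 sin(theta_i) = n2 sin(theta_t). The sketch below is a CPU-side reference in Python, not the project's shader; the function names are hypothetical, and the diamond index of about 2.417 is the commonly quoted value near 589 nm. A reference like this is one way to validate shader output numerically.

```python
import math
from typing import Optional

N_DIAMOND = 2.417  # commonly quoted refractive index of diamond near 589 nm
N_AIR = 1.0


def refract_angle(theta_i_deg: float, n1: float, n2: float) -> Optional[float]:
    """Snell's law: return the refraction angle in degrees.

    Returns None when the ray undergoes total internal reflection,
    i.e. when n1/n2 * sin(theta_i) exceeds 1.
    """
    s = n1 / n2 * math.sin(math.radians(theta_i_deg))
    if abs(s) > 1.0:
        return None
    return math.degrees(math.asin(s))


def critical_angle(n1: float, n2: float) -> Optional[float]:
    """Incidence angle above which TIR occurs; None if n1 <= n2."""
    if n1 <= n2:
        return None
    return math.degrees(math.asin(n2 / n1))


crit = critical_angle(N_DIAMOND, N_AIR)          # roughly 24.4 degrees
inside_10 = refract_angle(10.0, N_DIAMOND, N_AIR)  # ray exits, bending away from the normal
inside_30 = refract_angle(30.0, N_DIAMOND, N_AIR)  # beyond the critical angle: TIR
```

Dispersion is not modeled here; it would make the index depend on wavelength, so each color channel gets its own refraction angle.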

5. Project 4: High-Concurrency Communication Systems

This is your key non-GPU project for connecting to AI networking.

flowchart TB
    Client[Clients] --> Gateway[Gateway / Connection Management]
    Gateway --> Router[Routing]
    Router --> Service[Backend Services]
    Service --> Queue[Kafka / Queue]
    Queue --> Worker[Workers]
    Worker --> Result[Response / Events]

    Gateway --> Metrics[Latency / P99 / Connection Count]
    Queue --> Backpressure[Backpressure]
    Metrics --> Debug[Hot-path / Tail Latency Debug]

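Likely deep-dive questions:

| Follow-up question | What to cover |
| --- | --- |
| How do you support 100k connections? | Connection lifecycle, event loop/thread model, backpressure, routing, observability. |
| How do you guarantee sub-50ms latency? | Critical path, queueing, hot path, native acceleration, metric breakdown. |
| How does this relate to GPU networking? | Both are about data path, tail latency, backpressure, transport, and observability; only the hardware layer differs. |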
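
Backpressure is easier to explain with a concrete mechanism. The following is a minimal single-process sketch, assuming a hypothetical `BoundedIngress` class that sheds load when its buffer is full; a real gateway would more likely signal the producer or apply flow control at the transport level, so treat this only as an illustration of the bounded-queue idea.

```python
import queue
from typing import Any, List


class BoundedIngress:
    """Bounded buffer that rejects new work when full instead of growing."""

    def __init__(self, capacity: int) -> None:
        self._q: "queue.Queue[Any]" = queue.Queue(maxsize=capacity)
        self.accepted = 0
        self.rejected = 0

    def offer(self, msg: Any) -> bool:
        """Try to enqueue without blocking; return False if the buffer is full."""
        try:
            self._q.put_nowait(msg)
        except queue.Full:
            self.rejected += 1
            return False
        self.accepted += 1
        return True

    def drain(self, max_items: int) -> List[Any]:
        """Remove up to max_items messages in FIFO order."""
        out = []
        for _ in range(max_items):
            try:
                out.append(self._q.get_nowait())
            except queue.Empty:
                break
        return out


ingress = BoundedIngress(capacity=3)
results = [ingress.offer(i) for i in range(5)]  # a burst larger than the buffer
drained = ingress.drain(2)                      # workers catch up a little
retry_ok = ingress.offer(99)                    # space is available again
```

The design point worth stating is that the queue depth is bounded, so latency under overload stays bounded too, and the rejected counter becomes an observability signal rather than a silent pile-up.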

6. Project 5: Advantest Semiconductor ATE Software

This is your project for connecting to HW/SW joint debugging.

flowchart LR
    TestPlan[Test Plan] --> Control[ATE Control Software]
    Control --> Device[Device Under Test]
    Device --> Measure[Measurement Data]
    Measure --> Diagnose[Diagnostic Tooling]
    Diagnose --> Engineer[HW / Validation Engineer]
    Engineer --> Fix[Workflow / Calibration / Software Fix]

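Interview connections:

| JD point | Project connection |
| --- | --- |
| hardware features | analyzing anomalous results together with hardware engineers |
| correctness-sensitive systems | measurement correctness / repeatability |
| diagnostics | structured debugging / issue closure |
| system architecture | test workflow orchestration |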
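
Measurement repeatability can be described with a simple statistic over repeated readings. The sketch below is generic and not taken from any ATE product: the function name, the readings, the nominal value, and the tolerance are all hypothetical.

```python
import statistics
from typing import Dict, List, Union


def repeatability(
    readings: List[float], nominal: float, tolerance: float
) -> Dict[str, Union[float, bool]]:
    """Summarize repeated measurements of the same quantity.

    offset is the mean deviation from nominal (a calibration signal);
    stdev is the sample standard deviation (a repeatability signal).
    """
    mean = statistics.mean(readings)
    return {
        "mean": mean,
        "stdev": statistics.stdev(readings),
        "offset": mean - nominal,
        "within_tolerance": all(abs(r - nominal) <= tolerance for r in readings),
    }


# Hypothetical repeated readings of a 1.000 V reference.
good = repeatability([1.001, 0.999, 1.000, 1.002, 0.998], nominal=1.0, tolerance=0.005)
bad = repeatability([1.001, 0.999, 1.000, 1.002, 1.020], nominal=1.0, tolerance=0.005)
```

Separating offset from spread matters when debugging with hardware engineers: a stable offset points toward calibration, while a large spread points toward noise, timing, or contact issues.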

7. Project 6: NRI Quant / Compute / OpenCV Pipelines

This is your project for connecting to C++/Python compute pipelines, correctness, and traceability.

flowchart TB
    Data[Raw Data] --> Clean[Cleaning]
    Clean --> Feature[Feature Computation]
    Feature --> Model[Model / Backtest]
    Model --> Sim[Execution Simulation]
    Sim --> Eval[Evaluation]
    Eval --> Report[Reporting / Traceability]

    Image[Image Data] --> OpenCV[OpenCV Pipeline]
    OpenCV --> Classify[Classification / Recognition]

Interview connections:

| JD point | Project connection |
| --- | --- |
| Python/C++ | compute pipelines |
| algorithm design | feature/model/evaluation |
| correctness | traceability/backtesting |
| performance | batch computation / numerical processing |
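
Traceability in a backtest pipeline usually means that a result can be tied back to the exact inputs and configuration that produced it. The sketch below shows one common way to do that with a content hash; the function names, the config keys, and the toy moving-average feature are all hypothetical.

```python
import hashlib
import json
from typing import Any, Dict, List


def run_fingerprint(config: Dict[str, Any], rows: List[float]) -> str:
    """Deterministic SHA-256 fingerprint of the config and input data.

    sort_keys makes the hash independent of dict key order.
    """
    payload = json.dumps(
        {"config": config, "rows": rows}, sort_keys=True, separators=(",", ":")
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def moving_average(values: List[float], window: int) -> List[float]:
    """Simple moving average over full windows only."""
    return [
        sum(values[i : i + window]) / window
        for i in range(len(values) - window + 1)
    ]


prices = [1.0, 2.0, 3.0, 4.0]
config = {"window": 2, "source": "example"}

record = {
    "fingerprint": run_fingerprint(config, prices),
    "result": moving_average(prices, config["window"]),
}
```

Storing the fingerprint next to every report lets a later run prove that it used the same data and parameters, or show exactly which one changed.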

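8. Master Template for Presenting a Project

Present every project in this format:

1. Context:
   What production/system problem this project solves.

2. Architecture:
   Data flow, control flow, key modules.

3. My role:
   What I was responsible for, without overstating.

4. Bottleneck:
   What the performance/correctness/stability/observability problem was.

5. Approach:
   How I located it, how I designed the fix, how I validated it.

6. Result:
   Metrics, stability, reproducibility, engineering benefit.

7. NVIDIA relevance:
   How it relates to AI inference, GPU profiling, networking/data movement, and architecture.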