Gramian Consultancy is a boutique consultancy specializing in IT professional services and engineering talent solutions. With a strong background in software engineering and leadership, we help companies build high-performing teams by matching them with professionals who truly fit their needs.
Role Overview
We are seeking highly skilled and motivated Applied AI Research Scientists in Computer Science and Computer Engineering with an MS or Ph.D. in a relevant technical field. In this role, you will contribute to the design, validation, and execution of expert-level evaluation tasks that probe the limits of state-of-the-art AI systems. Your work will focus on creating headroom-level, rigorously verifiable questions across hardware, systems, and computing domains to assess and stress-test advanced multimodal and reasoning-capable AI models.
This position requires deep domain expertise, strong analytical rigor, and the ability to translate complex technical concepts into precise, evaluable challenges that expose model limitations beyond surface-level reasoning. You will work closely with a collaborative, cross-functional team and are expected to be a reliable team player who is highly detail-oriented and committed to accuracy and quality.
Duration: 3-6 months
Commitment: 8 hours per day, with 4 hours of mandatory overlap with PST
Model: Contract, time and materials
Location: 100% Remote (LATAM, Europe, MENA, Canada)
Interview Process: Take-home assignment + 1-hour technical/cultural interview
Key Responsibilities
- Design graduate- and research-level evaluation questions grounded in hardware and computer engineering domains.
- Create tasks that require precise, step-by-step technical reasoning with objectively verifiable ground-truth answers.
- Develop multimodal prompts, including accurate block diagrams, timing diagrams, microarchitecture diagrams, or circuit-level visuals when appropriate.
- Evaluate state-of-the-art AI models on hardware- and systems-heavy reasoning tasks and perform structured side-by-side comparisons.
- Identify and document model failure modes related to architectural correctness, performance reasoning, or low-level system behavior.
- Provide authoritative solutions and explanations for each evaluation task.
- Maintain detailed and accurate records of prompts, expected answers, and evaluation outcomes in shared tracking systems.
- Collaborate with reviewers and researchers to refine evaluation quality.
Requirements
- MS or Ph.D. in Computer Science, Computer Engineering, Electrical Engineering, Information Technology, Data Science, or a closely related field
- Strong expertise in at least two of the following hardware- and systems-focused domains:
  - Computer architecture (pipelines, memory hierarchies, cache coherence, ISA-level reasoning)
  - Hardware systems and performance analysis
  - VLSI design, digital logic, or ASIC/FPGA fundamentals
  - Embedded systems and low-level firmware
  - Operating systems (especially memory management, scheduling, and hardware–software interfaces)
  - Compilers or systems programming with hardware awareness
- Proven experience with applied AI research, technical evaluation, or research-driven problem formulation in real-world or production-oriented settings
- Strong programming proficiency, with experience in Python for analysis, verification, and evaluation workflows
- Strong written communication skills and the ability to collaborate effectively as a detail-oriented team player
- Familiarity with modern AI model capabilities, limitations, and benchmarking practices is a strong plus
Benefits
- Work in a fully remote environment.
- Opportunity to work on cutting-edge AI projects with leading LLM companies.