Usama Iftikhar — machine learning engineer.

I build AI systems: conversational agents with voice and multimodal capabilities, computer vision pipelines, and production-ready LLM integrations. I work across the full lifecycle, from prototyping ideas to deploying scalable systems in the cloud.

I also believe in giving back to the community, which is why I build open-source tools that make AI more accessible. Get in touch, or find me on GitHub, LinkedIn, and X.

Open source

Hermex

Drive ChatGPT and Gemini from Python — no API keys, no billing, just the free web UI.

Open-source Python library that lets you script the free ChatGPT and Gemini web interfaces directly — so you can prototype, batch, and automate without paying for API access. Useful for research, one-off automations, and anyone hitting paywalls before they've validated an idea. MIT licensed.

from hermex import Gemini

response = Gemini.simple_query("Explain attention in one paragraph")
print(response.text)

Selected projects

  • Multi-Agent Voice Bot

    Voice AI · LiveKit · LLM

    Multi-agent customer service voice system built with LiveKit — callers are routed to specialized AI agents based on query type, mimicking real call center escalation flows.

  • Multimodal AI Chatbot

    Multimodal · RAG · LLM

    Conversational AI that responds with text, images, and videos, pulling real-time data from live sources so users can make informed decisions.

Experience

A career that started in full-stack web development and evolved through computer vision into the LLM and voice AI space of today.

  1. Nov 2023 — Present
    Sr. Machine Learning Engineer — AI Dev Lab

    Working at the intersection of LLMs, voice AI, and multimodal systems — building production-ready agents and cloud-deployed AI pipelines.

  2. Jul 2022 — Oct 2023
    Machine Learning Engineer — Vacon.ai

    Moved into computer vision and generative AI — working with Stable Diffusion, YOLO, Grounding DINO and CLIP before LLMs became the industry's main focus.

  3. Jun 2021 — Jun 2022
    Full Stack Developer — Cipher Coders

    Built scalable web applications across the full stack — React.js, Node.js, Next.js and Python. This is where I learned to ship real products.

Skills

Python, LLMs, RAG, LangChain, LlamaIndex, FastAPI, AWS, Computer Vision, NLP

Education

Master's in Data Science, PUCIT — 4.0 GPA

Contact