LLM Chatbot Project Background

Building an LLM Chatbot for Car Buyers

Using Generative AI to help online car buyers find exactly what matters most to them

TL;DR

Using GCP, Vertex AI, and Gemini 2.0 Flash, I built a chatbot that reads car advert data from AutoTrader's marketplace and answers user questions about specific vehicles. The chatbot uses guardrails to stay polite, accurate, and on topic while refusing to make up information it does not have. The main takeaway is that while LLMs offer real value for personalising user experiences, the risk of incorrect or offensive outputs means businesses must carefully weigh innovation against brand reputation.

The Objective

Buying a car online is a challenge. There are pages full of information, technical jargon, and details that force everyday buyers to learn a lot before making what is often their second largest purchase. So the question became: can we remove this friction and help users quickly find the information that matters most to them?

Goal: Build a chatbot, powered by an LLM, that reads advert data from an online car marketplace, while ensuring the AI is safe and reliable to use.

A chatbot like this could transform pages of dense information into a personalised conversation that speaks directly to each user. However, the risk of the LLM producing incorrect or offensive content is a concern, and businesses must balance the value of innovation against potential damage to their brand.

Data & Context

Source: Internal AutoTrader data
Columns: vehicle_make, vehicle_model, vehicle_price, vehicle_mileage, key_features, and more (an illustrative record is sketched below)
Key Variables: Gemini 2.0 Flash model and live advert data
Limitations: LLM hallucinations and missing advert data fields
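
To make the data concrete, here is an illustrative advert record using the columns above. The values and exact schema are hypothetical; the real AutoTrader data is internal and not shown in this project.

```python
# Illustrative advert record. Field names follow the columns listed above;
# the values and exact schema are hypothetical, not real AutoTrader data.
advert = {
    "vehicle_make": "Ford",
    "vehicle_model": "Fiesta",
    "vehicle_price": 9495,       # GBP
    "vehicle_mileage": 42000,    # miles
    "key_features": ["Apple CarPlay", "Rear parking sensors", "Cruise control"],
}
```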

My Approach

I chose Gemini 2.0 Flash from Vertex AI's Model Garden because I needed a model that was both cost-effective and quick to deploy for a hackathon. This was not a production release, so speed and experimentation were our main priorities.

1. Researched available models in Vertex AI's Model Garden, focusing on options that were cheaper and faster, since this was a rapid prototype rather than a full release.
2. Built an initial version in Databricks using a small static dataset with four or five variables, and wrote simple system instructions that limited the LLM to answering questions only from the data passed through (a sketch of this follows the key decision below).
3. Created an endpoint using GCP's Cloud Run so the LLM could communicate with the front end without needing an entire backend system.
4. Added extraction functions that took the JSON request and mapped the data onto variables the LLM function could use (the endpoint and extraction step are sketched after this list).
5. Tested the deployed LLM thoroughly across the team, making adjustments when responses were formatted poorly or when the model could be tricked into ignoring its instructions.
6. Released a local version for colleague testing and gathered feedback on prompts that bypassed the guardrails; we patched these issues and continued testing.
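
Here is a minimal sketch of steps 3 and 4, assuming a Flask app deployed to Cloud Run and the advert field names listed earlier; this is illustrative, not the internal implementation. The generate_response() helper it calls is sketched after the key decision below.

```python
# Minimal sketch of the Cloud Run endpoint plus extraction function.
# Assumptions: a Flask app, the advert field names listed earlier, and a
# generate_response() helper (sketched below); not the internal implementation.
import os

from flask import Flask, jsonify, request

app = Flask(__name__)

def extract_advert_fields(payload: dict) -> dict:
    """Map the incoming JSON request onto the variables the LLM function uses.

    Missing advert fields are kept as None so the model can be told the data
    is unavailable rather than being left to guess.
    """
    return {
        "vehicle_make": payload.get("vehicle_make"),
        "vehicle_model": payload.get("vehicle_model"),
        "vehicle_price": payload.get("vehicle_price"),
        "vehicle_mileage": payload.get("vehicle_mileage"),
        "key_features": payload.get("key_features", []),
    }

@app.route("/chat", methods=["POST"])
def chat():
    body = request.get_json(silent=True) or {}
    advert = extract_advert_fields(body.get("advert", {}))
    question = body.get("question", "")
    answer = generate_response(question, advert)  # hypothetical helper, sketched below
    return jsonify({"answer": answer})

if __name__ == "__main__":
    # Cloud Run injects the serving port via the PORT environment variable.
    app.run(host="0.0.0.0", port=int(os.environ.get("PORT", 8080)))
```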
Key decision: I chose a pre-trained model rather than training our own. This meant working within the constraints we set through system instructions rather than customising the model itself, which saved significant time. A sketch of that approach follows.
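
This is a minimal sketch of constraining a pre-trained model through system instructions with the Vertex AI SDK. The project ID, region, and instruction wording are illustrative, not the exact internal prompt.

```python
# Minimal sketch of grounding a pre-trained model via system instructions.
# The project ID, region, and instruction wording are illustrative.
import json

import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="my-gcp-project", location="europe-west2")  # hypothetical

SYSTEM_INSTRUCTIONS = (
    "You are a polite assistant for an online car marketplace. "
    "Answer only from the advert data provided in the message. "
    "If the advert does not contain the information, say so plainly "
    "rather than guessing, and stay on the topic of this vehicle."
)

model = GenerativeModel("gemini-2.0-flash", system_instruction=SYSTEM_INSTRUCTIONS)

def generate_response(question: str, advert: dict) -> str:
    # Pass the advert data alongside the question so answers are grounded
    # in the listing rather than in the model's training data.
    prompt = f"Advert data:\n{json.dumps(advert)}\n\nQuestion: {question}"
    return model.generate_content(prompt).text
```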
[Screenshot: chatbot interface showing example queries]
Figure 1: Front-end chatbot interface, demonstrating the basic but key queries it would likely receive and its responses. (Internal backend code is not shown in this project.)

Findings

Real Value for Users

The chatbot successfully transformed dense advert pages into personalised conversations, letting users quickly find specific details like infotainment systems or finance options without having to read the entire advert.

Hallucinations Are a Challenge

LLMs are hard to wrangle. Even with clear instructions, the model would sometimes drift from its guidelines in longer or more difficult conversations, and it took a lot of trial and error to get consistent behaviour.
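
One common mitigation, not necessarily what we shipped, is to restate the grounding rules on every turn rather than relying on the initial system instructions alone. A sketch, reusing the model object from the earlier example:

```python
# Hypothetical mitigation: re-assert the grounding rules on every turn so
# long conversations are less likely to drift from the original instructions.
# Reuses the `model` object from the system-instruction sketch above.
REMINDER = (
    "Reminder: answer only from the advert data already provided; "
    "if a detail is missing, say you do not have it."
)

chat = model.start_chat()

def ask(question: str) -> str:
    # Prepend the reminder to each user message before sending it.
    return chat.send_message(f"{REMINDER}\n\n{question}").text
```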

Accountability Concerns

I can see why businesses often outsource their LLM solutions. If something goes wrong, the blame can be shifted to the provider, and the service can be swapped out before serious brand damage occurs.

Recommendation: Whether building an LLM chatbot makes sense depends on how a business positions itself. While it offers an innovative way to personalise user experiences, the production costs and brand risks of building in house are significant and need careful consideration. This would only be recommended if the business could provide adequate support and monitoring to manage the LLM's behaviour over time.

Reflection

This project taught me a great deal about building endpoints, working with Vertex AI's Model Garden, and crafting effective system instructions for LLMs.

If I had more time, I would build and train a model instead of using a pre-trained version, build a more robust backend rather than relying on the GCP endpoint alone, and add monitoring and sentiment analysis to spot negative conversations and feed those insights back into the instructions. In future projects, I will have a much better starting point and will not need as much trial and error to get the LLM behaving as expected.

Tools & Technologies

Python · Large Language Models · Generative AI · GCP · Vertex AI · APIs