Auto Trader Engineering Blog

  • How we used Databricks notebooks, MLeap and Kubernetes to productionize Spark ML faster

    Machine learning (ML) models are nothing new to us. We’ve used them to power products such as our used car valuations and Price Indicators. The processes involved in training and serving these models are complex, involving code written by both data scientists and developers in a variety of languages including Python, Java and R. Serving predictions from these models in real time typically involves a custom application to extract the coefficients from the saved model, and apply them to an input request.

    Read more…

  • Implementing a Projection Search on a RESTful Web Service

    What do we mean by Projection Search? Almost all web applications need to provide their users with a summarised view of their data. An email inbox is a good example or, in our case, a list of a user's stock items.

    This is usually modelled as a search in the underlying web service, the search returning a list of resource summaries. This summary contains a few fields and a link to the actual resource.

    Our recurring design headache is to decide which fields are included in the summary object.

    Read more…

  • Sending Spark logs to ELK using logstash-gelf

    An important part of any application is its underlying log system. Logs are fundamental for debugging and traceability, but can also be useful for further analysis and even in areas such as business intelligence and key performance indicators (KPIs). At Auto Trader, we emphasise the importance of building a robust application logging system that can be integrated into our ELK stack, which serves as a centralised log store.

    Read more…

  • Let's Encrypt at Scale

    Enabling HTTPS on 3,000+ websites is a bit of a pain. But as we are now in the age of increasing online privacy, we had to knuckle down and find a way to do it. We provide a platform for trade dealers to upload and advertise their stock online. As an extension of this, we offer a product that allows customers to host a private website using this stock, under their own domain. We wanted to provide HTTPS support to all of these websites.

    Read more…

  • Customisable Logging on Kubernetes

    Centralised logging has been a part of Auto Trader for the best part of eight years, and for the last three of those we've used Elasticsearch as our software of choice.

    From application logs to system logs, monitoring logs to security logs, we try our best to provide a central location for users to log what they care about, and we try to make it as easy as possible to do so. This is true for our current platform and we aim to make it true for our next generation platform, running Kubernetes on Google Cloud.

    This post will cover the solution we’ve implemented to achieve good logging on Kubernetes.

    Read more…

  • Speaking about Continuous Deployment at Pink18

    Last year Dave Whyte and I were lucky enough to be invited to speak at Pink18 in Orlando, Florida. The Pink Elephant conference is the world's largest and most prestigious IT Service Management conference, having run for over two decades.

    This year's theme was Integrated Service Management, which is about aligning traditional Service Management processes with DevOps and Lean/Agile.

    This post is about the talk I gave, the public speaking experience and the preparation required to deliver a cohesive talk.

    Read more…

  • A Boot Camp Career Reboot

    Software is eating the world, say the pundits, and consequently companies like Auto Trader are hungry for developers. The excess of demand over supply has led to initiatives such as TechReturners and our own recent Returners Discovery Day, which aim to lure back those who have left the industry. But there is another source of coding talent which has emerged in recent years: the coding boot camp. Can a few months of coding practice be enough to prepare someone with little or no experience of coding for a career as a developer? Even someone like me?

    Read more…

  • Experiments and the 3 Lies we Pin on Them

    It is said that there are three types of lies: lies, damn lies and statistics. However, you don’t need a degree in statistics to be able to recognize these lies, and redesign your experiment to tell the truth.

    Before Christmas, we launched a new advert creation journey, and after significant development effort, I’ve been very invested in making it a success. Lean principles state that we should try to get a product or service out to the user as soon as possible, and observe the effects. However, the work we do to launch a new feature often motivates us to root for it too much, and ignore problematic ways in which we gather the data. This often results in a phenomenon known as the sunk cost fallacy, where people convince themselves to throw good money after bad, to avoid the pain of losing. Here are a few tips on how to spot the mistakes we often introduce into the data when experimenting on users.

    Read more…

  • Use templates for better Git commit messages

    Commit messages are important. They are a means of communication with yourself and your team throughout the life of your codebase (remember that team members are likely to come and go over time). In fact, given that they live alongside the code, they’ll probably be the best source of documentation that you have about the evolution of your codebase. Commit messages are important!

    Read more…

  • Using AWS Lambdas for data lake monitoring

    Here at Auto Trader, a core part of our strategy is to help improve the process of buying and selling vehicles through the provision of data-driven intelligence. This data-driven approach isn’t just something we suggest to our customers; we also use data internally to drive our product development. For example, our product teams use KPI dashboards when adding new features or developing new products to ensure that they’re making informed decisions. The data that drives these dashboards needs to be correct, and we need to know if anything goes wrong.

    Read more…

  • How we (almost) completed a natural language recipe recommendation engine using Twitter in a 24-hour hackathon

    On 28th October, four of us Java developers attended HAC100’s Hack Manchester event, a 24-hour hackathon of coding, coffee and chaos (25 hours really, as the clocks went back). We were one of three Auto Trader teams out of about 50 in total. Our team took on the dunnhumby challenge, “to use technology to enhance the retail experience for a customer in the home or in store.” By the end, we had built a personalised recipe recommendation engine with a natural-language-based Twitter interface.

    Read more…

  • Everything's a component—writing domain specific, re-usable Angular components across squads

    If you haven’t heard about how Auto Trader works yet, we’re structured into squads wherein each squad owns, maintains and develops within a particular domain to implement our business initiatives autonomously. Within the retailer products division of Auto Trader are several squads working on the multi-faceted ‘Dealer Portal’ product to help vehicle dealers optimise their daily workings. All of the disparate bits of technology to make Dealer Portal tick are encompassed under an umbrella project and common client/server technology stack we lovingly refer to as ‘Portal’. This post will discuss how we formed a strategy to maintain consistency across the Portal front-end.

    Read more…

  • Building a Fast Search Experience

    Auto Trader provides a search platform for dealers to buy vehicles from other traders. A high-performance search experience is critical, as this helps create a competitive marketplace for dealers to purchase vehicles. This blog post will take you through some of the changes we made to create a high-performance search platform that regularly returns results in less than a second.

    Read more…

  • Text Mining Our Dealer Reviews

    At Auto Trader, knowing our customer is very important to us, and we invest a lot in researching what makes them tick. In previous posts, we described how we used text mining to understand our customers through our customer support descriptions. We have another very valuable source of data: our dealer reviews.

    Read more…

  • Supporting Building a Data Lake

    Data is king. Information is power. It’s not just about storing lots of it though. There is no point having years of data without the ability to interrogate it and surface the information required in a timely manner. We at Auto Trader recognised the power of data some time ago. As our data set grew, we realised that we needed a better structure and quicker queries.

    Read more…

  • Lead Developer 2017

    I was given the opportunity to attend the Lead Developer conference, which took place on the 8th and 9th of June 2017 at the Queen Elizabeth Exhibition Centre in London. This was obviously exciting enough on its own, but it also coincided with the UK General Election, and the QEII Centre is situated right in the heart of Westminster, opposite the cathedral and behind the Supreme Court, just a stone’s throw from the Houses of Parliament.

    Read more…

  • It matters what you measure

    It’s not enough to simply measure things; you need to measure the right things! When I first introduced release reporting at Auto Trader, I based the success criteria on the things the business cared about at the time. It was very basic: a release was deemed to have ‘failed’ if it was either backed out or needed to be fixed. As the way we measured success evolved, we noticed that this didn’t really represent the direction we wanted to follow.

    Read more…

  • Resilience4j - a lightweight, flexible circuit breaker

    We knew that our application would break if the database was down. More precisely, we knew that our service endpoint would time out while logging some non-essential but useful information to the database for each item in a request’s large batch.

    Read more…

  • Exception Handling Conundrums

    Our squad recently came across an unhelpful error message in a response from one of our APIs. After some digging around in the code, we discovered that the message was due to an exception that had not been handled appropriately and instead had bubbled up. This prompted us to sit down as a team and agree on some best practices around how we will deal with exceptions. This post will cover what we agreed on, in particular how to handle crossing knowledge boundaries, using examples from our codebase.

    Read more…

  • Adding GitHub Organisation Webhook Support to GoCD

    The bulk of our active codebases have, over time, made their home on our GitHub Enterprise server. We also have a GoCD (continuous delivery pipeline) server that polls these repositories to work out whether it has something to do. The upshot is that, every minute, for each of these codebases, GoCD polls GitHub for changes. This consumes a lot of unnecessary CPU cycles (especially because some of these sources haven’t been updated recently) and is one of the reasons our GoCD server is slower than we’d like it to be. This blog post will talk about how we improved this and my experiences while contributing code back to the open source community.

    Read more…