GeeksCodeAI-TheBetterLeetcode
As part of my internship, I was assigned an exciting and ambitious task: building a mock interview platform with the essence of Codeforces or LeetCode, enhanced with AI code review capabilities. This assignment was the initial challenge I had to complete before progressing further in the internship's selection phase, and I saw it as the perfect opportunity to apply my backend and full-stack skills to a real-world problem space.
The core idea was to create a developer-focused platform where:
- Users can attempt coding problems in a LeetCode-style interface
- Companies can post jobs and attach coding questions as part of their interview pipeline
- Submitted code is evaluated against predefined test cases
- AI provides automated reviews and suggestions based on the user's code
While the project also included a job posting and interview pipeline feature, this article focuses on the core of the coding platform — how the system evaluates submissions, how problems are set up, and how the backend handles the entire workflow securely and efficiently.
# Code Evaluation
The backbone of any coding platform lies in how it evaluates submitted code. Here's how I implemented that process:
- Each coding problem comes with multiple test cases (both sample and hidden).
- When a user submits a solution, the backend compiles and runs the code against these test cases.
- The result for each test case (pass/fail, output, error, etc.) is recorded.
- If all test cases pass, the submission is marked successful; otherwise, detailed feedback is shown.
This system provides users with instant feedback and encourages iterative debugging — just like any real-world online judge.
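The per-test-case loop above can be sketched as follows. In the real platform the code runs inside a Judge0 sandbox (covered later); here `runSubmission` is a hypothetical stand-in that maps a test input to the program's stdout.

```typescript
// Sketch of the evaluation loop: run each test case, record pass/fail,
// and mark the submission successful only if every case passes.

type TestResult = { testCase: number; passed: boolean; actual: string; expected: string };

function evaluate(
  runSubmission: (input: string) => string, // stand-in for sandboxed execution
  cases: { input: string; expected: string }[]
): { passed: boolean; results: TestResult[] } {
  const results = cases.map((tc, i) => {
    const actual = runSubmission(tc.input).trim(); // normalise trailing whitespace
    return {
      testCase: i + 1,
      passed: actual === tc.expected.trim(),
      actual,
      expected: tc.expected,
    };
  });
  // The submission succeeds only if every test case passes;
  // otherwise the per-case results become the detailed feedback.
  return { passed: results.every(r => r.passed), results };
}
```

Returning the full `results` array (not just a boolean) is what makes the detailed per-test-case feedback possible.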
# Storing Test Cases
Instead of hardcoding test cases in the database, test inputs and outputs are stored as files, organized either in the file system or uploaded to an S3-compatible storage system. This makes it easy to manage a large number of problems and test cases efficiently. For this project, I used the file system, versioned on GitHub.
# Problem Setting & Boilerplate Generation
To make the platform easy to maintain and scalable, I introduced a structured approach to problem setting:
### Input by the Problem Setter
For each problem under the `/Problems` directory, the problem setter has to provide:

- `Structure.md`: describes the problem's function signature, for example:

  ```md
  Problem Name: "Sum of two numbers"
  Function Name: "TwoSum"
  Input Structure:
  Input Field: int num1
  Input Field: int num2
  Output Structure:
  Output Field: int result
  ```

- `Problem.md`: a rich Markdown description containing the problem name, statement, sample test cases, and additional details, rendered directly in the browser.
- `tests/`: each `fileN.txt` is a JSON-encoded input list for test case N.
- `result/`: each `fileN.txt` is the JSON-encoded expected result for test case N.
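Under this layout, loading a problem's test cases on the backend can be sketched like this. The pairing of `tests/fileN.txt` with `result/fileN.txt` follows the structure above; the helper name is my own.

```typescript
// Sketch: read every tests/fileN.txt and pair it with result/fileN.txt,
// parsing both as JSON (inputs are JSON-encoded lists, results are
// JSON-encoded values, as described above).

import * as fs from 'fs';
import * as path from 'path';

function loadTestCases(problemDir: string): { input: any; expected: any }[] {
  const testsDir = path.join(problemDir, 'tests');
  const cases: { input: any; expected: any }[] = [];
  for (const file of fs.readdirSync(testsDir).sort()) {
    cases.push({
      input: JSON.parse(fs.readFileSync(path.join(testsDir, file), 'utf-8')),
      // The expected result lives under result/ with the same file name.
      expected: JSON.parse(fs.readFileSync(path.join(problemDir, 'result', file), 'utf-8')),
    });
  }
  return cases;
}
```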
### Output Generated Automatically
I created a custom parser that reads the `Structure.md` file provided by the problem setter and extracts the information in it. This information is then used to create two auto-generated folders:
- `boilerplate/`: contains the starter function definition for each language (`function.py`, `function.cpp`, `function.go`, etc.) that can be rendered directly in the browser editor.

  ```rs
  fn TwoSum(num1: i32, num2: i32) -> i32 {
      // Implementation goes here
      result
  }
  ```

- `boilerplate-full/`: contains the user's submitted function, the code to read input from `tests/`, a function call, and a statement that prints the result.

  ```ts
  ##USER_CODE_HERE##

  const input = require('fs')
    .readFileSync('/dev/stdin', 'utf-8')
    .trim()
    .split('\n')
    .join(' ')
    .split(' ');
  const num1 = parseInt(input.shift());
  const num2 = parseInt(input.shift());
  const result = sum(num1, num2);
  console.log(result);
  ```
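A minimal sketch of such a parser and generator in TypeScript, assuming the `Structure.md` fields shown earlier (the real project emits stubs for every supported language; names like `parseStructure` and the naive `int` → `number` type map are illustrative):

```typescript
// Sketch: parse the Structure.md key/value format into a typed object,
// then emit a starter stub from it.

interface ProblemStructure {
  functionName: string;
  inputs: { type: string; name: string }[];
  output: { type: string; name: string };
}

function parseStructure(md: string): ProblemStructure {
  // Collect the values of every line starting with the given key.
  const grab = (key: string) =>
    md.split('\n').filter(l => l.startsWith(key)).map(l => l.slice(key.length).trim());
  // "int num1" -> { type: "int", name: "num1" }
  const field = (line: string) => {
    const [type, name] = line.split(/\s+/);
    return { type, name };
  };
  return {
    functionName: grab('Function Name:')[0].replace(/"/g, ''),
    inputs: grab('Input Field:').map(field),
    output: field(grab('Output Field:')[0]),
  };
}

// Emit a TypeScript starter stub from the parsed structure.
function tsBoilerplate(s: ProblemStructure): string {
  const tsType = (t: string) => (t === 'int' ? 'number' : t); // naive type map
  const params = s.inputs.map(f => `${f.name}: ${tsType(f.type)}`).join(', ');
  return `function ${s.functionName}(${params}): ${tsType(s.output.type)} {\n  // Implementation goes here\n}`;
}
```

One such emitter per target language is enough to populate the whole `boilerplate/` folder from a single `Structure.md`.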
# Secure Code Execution with Judge0 Sandboxing
Executing user-submitted code safely is a major concern for any online coding platform. Instead of building a custom sandboxing system from scratch, I integrated Judge0 — a powerful open-source API for running code in isolated environments.
Here's how Judge0 helped us:
- Secure & Isolated: Judge0 runs code in secure Docker-based sandboxes with no network access, preventing any malicious behavior.
- Language Support: It supports more than 40 languages, so I could easily allow users to code in their preferred language without custom setup.
- Built-in Resource Limits: Judge0 automatically enforces memory, CPU, and execution time limits, making it reliable and safe for handling real-world submissions.
# Asynchronous Code Processing with Judge0
Since Judge0's API operates asynchronously, our backend system was built around this model for efficient job handling:
- When a user submits code, the backend sends it to the Judge0 API and receives a token in response.
- Instead of blocking the request, I poll the API asynchronously or wait via webhooks (explained below) to check when execution is complete.
- Once the result is ready, I store the execution output, test case results, and any errors.
This non-blocking flow ensures our system remains performant and responsive, even with many users submitting code at the same time.
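The submit-then-poll flow above can be sketched like this. The `/submissions` and `/submissions/{token}` paths match Judge0's REST API; the injected `api` helper and the status-ID cutoff (IDs 1 and 2 mean "In Queue" / "Processing" in Judge0) are simplifications for illustration.

```typescript
// Sketch: create a Judge0 submission, get back a token, then poll
// the token until the submission leaves the queued/processing states.

type Json = Record<string, any>;
// The HTTP layer is injected so it can be swapped for a real fetch
// against a Judge0 instance: POST when a body is given, GET otherwise.
type Api = (path: string, body?: Json) => Promise<Json>;

async function submitAndPoll(
  api: Api,
  sourceCode: string,
  languageId: number,
  stdin: string
): Promise<Json> {
  // 1. Create the submission; Judge0 returns a token immediately.
  const { token } = await api('/submissions?base64_encoded=false', {
    source_code: sourceCode,
    language_id: languageId,
    stdin,
  });

  // 2. Poll until the status leaves "In Queue" (1) / "Processing" (2).
  while (true) {
    const result = await api(`/submissions/${token}`);
    if (result.status && result.status.id > 2) return result; // finished
    await new Promise(r => setTimeout(r, 500)); // back off before re-polling
  }
}
```

In production this loop is replaced by the webhook flow described below, so the backend never busy-waits on a submission.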
# Webhooks
To make the experience seamless, I used Judge0's webhook support for real-time updates:
- While submitting a code execution request, I attach a webhook URL.
- Once Judge0 finishes executing the code, it sends the results to our webhook endpoint.
- Our backend then processes the result and updates the user's submission status immediately.
This approach removed the need for continuous polling and provided real-time feedback to users, making the platform feel fast and interactive.
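A minimal webhook receiver can be sketched with only Node's built-in `http` module. The `/judge0/callback` path and the `handleResult` hook are hypothetical stand-ins for our actual endpoint and the database update logic; Judge0 calls back the URL supplied at submission time with the finished result.

```typescript
// Sketch: an HTTP endpoint that accepts Judge0's callback, parses the
// result payload, and hands it to application code for processing.

import * as http from 'http';

function createWebhookServer(handleResult: (result: any) => void): http.Server {
  return http.createServer((req, res) => {
    if (req.url === '/judge0/callback') {
      let body = '';
      req.on('data', chunk => (body += chunk));
      req.on('end', () => {
        handleResult(JSON.parse(body)); // e.g. persist status, notify the user
        res.writeHead(200).end();
      });
    } else {
      res.writeHead(404).end();
    }
  });
}
```

In the real backend this handler would verify the payload and update the submission record before pushing the status to the user.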