New Project: Building an AI-Powered Code Review Assistant
I'm starting a new side project, and I want to document the journey from day one.
Not the polished success story you read on Product Hunt, but the messy, uncertain beginning where you're making decisions with incomplete information.
The project is called ReviewBot – an AI-powered code review assistant specifically designed for Rails applications. Think of it as having a senior Rails developer looking over your shoulder, but one that never gets tired, never gets annoyed at repetitive questions, and costs less than a coffee per day.
What I'm Building (And Why)
The idea came from a painful reality I've observed over 12 years of Rails development: code review is either thorough and slow, or fast and superficial. There's rarely an in-between.
In small teams, you're lucky if anyone reviews your code at all. Everyone's busy shipping features. Pull requests sit for days. When someone finally reviews, it's a quick scan looking for obvious bugs, not thoughtful feedback on architecture or performance implications.
In larger teams, you get reviews, but they're inconsistent. One senior developer will catch that you're about to create an N+1 query nightmare. Another will approve the same pattern without comment. Junior developers learn different lessons depending on who reviews their code that day.
I wanted to build something that provides consistent, thoughtful feedback on every pull request. Not to replace human reviewers, but to catch the mechanical stuff so humans can focus on the interesting problems like business logic and architecture.
ReviewBot analyzes Rails pull requests and provides feedback on specific things: N+1 queries hiding in innocent-looking controller changes, missing database indexes that will cause problems at scale, security vulnerabilities in how parameters are handled, performance implications of eager loading strategies, and violations of Rails conventions that make code harder to maintain.
The goal isn't to be a linter. We have RuboCop for that. The goal is to understand context. If you add a controller action that loads a user's posts in a loop, ReviewBot doesn't just say "this is N+1" – it explains why it matters for your specific use case, shows you what the database queries will look like, and suggests the exact code change that would fix it.
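To make the N+1 example concrete, here is a minimal sketch of the kind of first-pass heuristic ReviewBot might start from before handing context to the AI: scan added diff lines for an `.each` loop whose body touches what looks like an association. The function name and the pattern list are hypothetical illustrations, not the actual implementation.

```ruby
# Hypothetical first-pass heuristic: inside an .each loop, flag lines that
# look like per-record association queries (the classic N+1 shape).
# A real analyzer would pass the surrounding context to the LLM rather
# than rely on regexes alone.
def flag_possible_n_plus_one(diff_lines)
  findings = []
  in_loop = false
  diff_lines.each_with_index do |line, i|
    code = line.sub(/^[+\- ]\s*/, '')   # strip the diff marker
    in_loop = true if code =~ /\.each\s+do\b/
    if in_loop && code =~ /\.\w+\.(count|size|first|last|where|find)\b/
      findings << { line: i + 1, code: code }
    end
    in_loop = false if code.strip == 'end'
  end
  findings
end
```

A diff that loads each post's comment count inside a loop would produce one finding pointing at the offending line; the fix ReviewBot suggests would be a `counter_cache` or an eager-loaded `includes`.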
Why This Needs to Exist
I've seen too many production incidents caused by code that looked fine in review but had subtle performance or security issues. A controller that worked perfectly with ten test records but fell apart with ten thousand real users. An authentication check that was almost right but had an edge case that leaked data. A database query that performed well on a developer's laptop but crushed the production database.
These aren't issues that RuboCop catches. They require understanding Rails conventions, database behavior, and common pitfalls. The kind of knowledge senior developers accumulate over years.
But senior developers are expensive and scarce. Most teams have one or two seniors reviewing code from five or ten other developers. The math doesn't work. Something falls through the cracks.
AI can help here. Not by replacing the senior developer's judgment on architecture or business logic, but by handling the pattern matching. Does this code follow Rails conventions? Is there an N+1 query? Are parameters properly sanitized? Is this database query going to perform well at scale?
These are mechanical checks that require knowledge but not creativity. Perfect for AI augmentation.
The Stack (And Why I Chose It)
I'm building this with Rails, obviously. Not just because I know Rails well, but because eating your own dog food matters. If I'm building a tool for Rails developers, it should be built with Rails. I want to experience the same pain points my users will experience.
The AI layer uses Claude through Anthropic's API. I tested ChatGPT, Gemini, and Claude on the same Rails code samples, and Claude consistently gave the most accurate and contextual feedback. It understands Ruby syntax better, catches Rails-specific antipatterns more reliably, and explains issues in a way that's actually helpful rather than patronizing.
I'm using PostgreSQL for the database because I need to store repository metadata, pull request history, and analysis results. SQLite would work for MVP, but I know I'll need Postgres features eventually, and migrating databases is painful. Starting with Postgres saves future headaches.
For background jobs, I'm using Sidekiq. Analyzing pull requests takes time – sometimes thirty seconds or more for large diffs. I can't block the web request for that. Sidekiq lets me queue analysis jobs, process them asynchronously, and scale workers independently if needed.
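The shape of that workflow can be sketched with nothing but the Ruby standard library: the web request only enqueues, and a worker processes the slow part off the request cycle. This is a toy illustration of the pattern, assuming placeholder names; the real app uses a Sidekiq job backed by Redis, which adds persistence, retries, and multi-process workers.

```ruby
# Stdlib sketch of the queue-and-worker pattern Sidekiq provides at scale:
# the webhook handler only pushes a job and returns; a worker thread does
# the slow analysis. (In the real app this is a Sidekiq job, not a Thread.)
ANALYSIS_QUEUE = Queue.new

WORKER = Thread.new do
  while (job = ANALYSIS_QUEUE.pop)
    break if job == :shutdown
    # Placeholder for the slow part: fetch the diff, call the LLM,
    # post the feedback comment back to GitHub.
    job[:result] = "analyzed PR ##{job[:pr_number]}"
  end
end

def enqueue_analysis(pr_number)
  job = { pr_number: pr_number }
  ANALYSIS_QUEUE << job   # returns immediately; the webhook response stays fast
  job
end
```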
The frontend is pure Rails with Hotwire. No React, no separate frontend repo, no build complexity. Turbo handles dynamic updates when analysis completes. Stimulus adds interactivity where needed. Import maps mean no webpack. The entire JavaScript footprint is under one kilobyte.
For GitHub integration, I'm using Octokit to interact with their API. When a pull request is opened, GitHub sends a webhook. My app receives it, queues an analysis job, fetches the diff, runs it through Claude with specific prompts, and posts the feedback as a pull request comment.
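Before trusting any webhook payload, the app has to verify it actually came from GitHub. GitHub signs each delivery with HMAC-SHA256 over the raw request body, keyed with the webhook secret, and sends the result in the `X-Hub-Signature-256` header. A minimal check, using only the openssl standard library (the function name is mine, not a library API):

```ruby
require 'openssl'

# Verify GitHub's webhook signature before processing the payload.
# The header value is "sha256=" followed by the hex HMAC-SHA256 digest
# of the raw request body, keyed with the configured webhook secret.
def valid_github_signature?(secret, raw_body, signature_header)
  expected = 'sha256=' +
    OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new('sha256'), secret, raw_body)
  # Constant-time comparison so the check doesn't leak timing information.
  OpenSSL.secure_compare(expected, signature_header.to_s)
end
```

Only after this check passes does the controller enqueue the analysis job.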
I'm deploying to a single Dokku server on a five-dollar-per-month DigitalOcean droplet. For MVP, this is plenty. Dokku gives me Heroku-like deployment workflow without the Heroku price tag. When I need to scale, I'll know because the server metrics will tell me.
The MVP Scope (What I'm Not Building)
This is where I had to make hard decisions. My brain wants to build everything. Support for GitLab and Bitbucket. Team analytics showing code quality trends over time. Integration with Slack for notifications. Custom rule configuration per repository. Historical analysis of existing codebases.
But that's how side projects die. You build for six months, launch with a hundred features, and discover users only care about three of them. The other ninety-seven were wasted effort.
So I'm shipping the absolute minimum that provides value. One core workflow: developer opens pull request, ReviewBot analyzes it, developer gets feedback as a comment. That's it.
No dashboard. No analytics. No configuration options. No team features. No integrations beyond GitHub. Just the core value proposition: automated Rails code review that actually understands your code.
If people use it and find it valuable, I'll add features based on what they actually ask for. If nobody uses it, I'll have wasted weeks instead of months.
The MVP includes only the most critical checks. N+1 query detection because it's the most common Rails performance problem. Missing index detection because it's the second most common. SQL injection vulnerability scanning because security matters. Mass assignment checks because it's a classic Rails security gotcha. Basic authentication and authorization review because these bugs leak data.
That's five types of checks. Each one catches real problems that cause real production incidents. If ReviewBot catches even one N+1 query before it hits production, it's paid for itself.
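As with the N+1 check, each of these can start as a cheap textual heuristic that decides whether the diff is worth deeper AI analysis. Here is a hypothetical sketch for the mass-assignment check: flag added lines that blanket-permit parameters or hand the raw `params` hash straight to a model. The patterns and names are illustrative, not the shipping rule set.

```ruby
# Hypothetical first cut at the mass-assignment check: flag added diff
# lines that permit every attribute or pass raw params into a model.
MASS_ASSIGNMENT_PATTERNS = [
  /params\.permit!/,                             # permits every attribute
  /\.(new|create|update)\s*\(\s*params\s*[\),]/  # raw params into a model
].freeze

def mass_assignment_findings(diff_lines)
  diff_lines.each_with_index.filter_map do |line, i|
    next unless line.start_with?('+')   # only inspect added lines
    { line: i + 1, code: line.strip } if MASS_ASSIGNMENT_PATTERNS.any? { |p| line =~ p }
  end
end
```

A properly filtered call like `user.update(user_params)` passes untouched; `User.update(params)` or `params.permit!` gets flagged for the AI to explain.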
Technical Decisions I'm Confident About
Some choices feel obviously right. Using Rails is correct – building a Rails tool in anything else would be weird. Using Claude is correct – I tested the alternatives and Claude performs better for this specific use case. Using Hotwire is correct – the UI requirements are simple and don't need React's complexity.
Deploying to a single server is correct for MVP. Premature scaling is wasteful. I'll know when I need more servers because metrics will scream at me. Until then, keep it simple.
Using PostgreSQL over SQLite is borderline, but I'm comfortable with it. The overhead is minimal, and I'd rather not migrate later.
Technical Decisions I'm Uncertain About
Other choices feel less certain. I'm storing entire pull request diffs in the database. This could get large. Should I be storing them in object storage like S3 instead? Probably, eventually. But for MVP, keeping everything in Postgres is simpler.
I'm using Sidekiq for background jobs, which requires Redis. That's another moving part. Could I use Solid Queue instead and stay pure PostgreSQL? Maybe. But Sidekiq is battle-tested and I know it well. Switching later isn't that hard if needed.
I'm processing pull requests synchronously – one diff, one API call to Claude, one result. Should I be batching requests? Streaming responses? Probably, for cost optimization. But that's complexity I don't need yet.
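The synchronous call itself is simple to sketch: one request body per diff, following Anthropic's Messages API shape (a `model`, a token budget, a `system` prompt, and a `messages` array). The model id and the prompt wording below are placeholders, not the tuned production prompt.

```ruby
require 'json'

# Build the request body for one synchronous analysis call.
# The structure follows Anthropic's Messages API; the model id is a
# placeholder -- substitute whatever current model you target.
def build_review_request(diff)
  {
    model: 'claude-sonnet-4-5',   # placeholder model id
    max_tokens: 2048,
    system: 'You are a senior Rails reviewer. Report only concrete, ' \
            'actionable findings: N+1 queries, missing indexes, SQL ' \
            'injection, mass assignment, and auth gaps.',
    messages: [
      { role: 'user', content: "Review this Rails pull request diff:\n\n#{diff}" }
    ]
  }
end

payload = JSON.generate(build_review_request("+ @posts = Post.all"))
```

Batching would mean packing several diffs into one request; streaming would mean consuming the response as it arrives. Both are deferred for the same reason: the one-shot version is easier to reason about and debug.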
I'm posting feedback as a single comment with all findings. Should it be one comment per issue? Should it use GitHub's review comment API to attach feedback to specific lines? Definitely, that would be better UX. But it's also more complex to implement. Starting simple, iterating based on feedback.
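For the single-comment version, the mechanics are just GitHub's REST API: pull request comments go through the issues endpoint, while the line-level feedback described above would use the pull request review API instead. A sketch that builds (but does not send) the request, assuming a hypothetical findings list:

```ruby
require 'net/http'
require 'json'
require 'uri'

# Post all findings as one PR comment. PR-level comments use GitHub's
# issues endpoint; per-line review comments would use the pulls review
# API instead. This builds the request without sending it.
def build_comment_request(repo, pr_number, findings, token)
  uri = URI("https://api.github.com/repos/#{repo}/issues/#{pr_number}/comments")
  req = Net::HTTP::Post.new(uri)
  req['Authorization'] = "Bearer #{token}"
  req['Accept'] = 'application/vnd.github+json'
  req.body = JSON.generate(body: findings.map { |f| "- #{f}" }.join("\n"))
  req
end
```

In the app itself this goes through Octokit rather than raw Net::HTTP, but the endpoint and payload are the same.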
What Could Kill This Project
I'm going into this with eyes open about the risks. The biggest one is cost. Every pull request analyzed costs money – Claude API calls aren't free. If the product gets traction, API costs could outpace revenue. I need to either charge enough to cover costs plus margin, or optimize prompts to reduce token usage, or both.
The second risk is accuracy. If ReviewBot gives false positives constantly, developers will ignore it. If it misses real issues, it's not providing value. The line between helpful and annoying is thin. I need to tune prompts carefully and probably add feedback mechanisms so I can improve accuracy over time.
The third risk is competition. GitHub Copilot already does some code review. There are other AI code review tools launching. Am I late to this party? Maybe. But I think there's room for a tool specifically optimized for Rails, with deep understanding of Rails patterns and conventions. Generic tools try to do everything for every language. I'm betting that specialization wins.
The fourth risk is that nobody wants this. Maybe developers are happy with their current code review process. Maybe they don't trust AI feedback. Maybe they prefer human reviewers even if they're slow and inconsistent. I won't know until I ship and see if anyone uses it.
The Launch Plan
I'm giving myself four weeks to ship MVP. Not four months, four weeks. This forces brutal prioritization. Every feature that's not absolutely critical gets cut.
Week one is setup and core architecture. Rails app, GitHub OAuth, webhook handling, basic job queue. By end of week one, I should be able to receive a pull request webhook and queue a job.
Week two is the AI integration. Prompt engineering, Claude API integration, parsing responses, posting comments back to GitHub. By end of week two, the core workflow should work end-to-end, even if the feedback quality is rough.
Week three is refinement. Improving prompts based on test cases. Adding the five critical checks I defined. Making feedback actionable and clear. By end of week three, feedback should be genuinely helpful.
Week four is polish and launch prep. Landing page, documentation, pricing page, payment integration with Stripe. By end of week four, someone should be able to sign up, connect their GitHub repo, and start getting reviews.
Then I launch. Not on Product Hunt or Hacker News. Just quietly releasing to the world and telling a few Rails communities. Posting in Rails forums, sharing on Twitter, maybe writing a blog post about it.
If ten people try it and five keep using it after a week, that's validation. If nobody uses it or everyone churns immediately, I learn something and move on.
Why I'm Documenting This
I'm writing about this project from day one for a few reasons. First, it holds me accountable. Public commitment is harder to abandon than private plans. Second, it might help someone else starting a similar journey. Most project stories are told in retrospect, polished and sanitized. I want to capture the uncertainty and decisions in real time.
Third, writing clarifies thinking. Explaining my technical choices forces me to examine whether they're actually good choices or just comfortable defaults. If I can't articulate why I chose Postgres over SQLite, maybe I should reconsider.
And fourth, if this project succeeds, having documented the journey from the start could be valuable content. If it fails, at least I'll have documented what didn't work and why.
What's Next
This week I'm building the skeleton. Rails app, GitHub integration, webhook handling, basic UI. Nothing fancy, just plumbing. By Friday I should be able to receive pull request webhooks and see them in a simple dashboard.
Next week I'll tackle the AI integration. Getting Claude to analyze code diffs and return structured feedback. This is the hard part. Prompt engineering is more art than science. I'll probably spend days tweaking prompts to get useful, accurate feedback.
I'll write another post next week with progress, problems encountered, and lessons learned. Real-time documentation of what's working and what's not.
If you're building something similar or just interested in following along, I'd love to hear from you. What would make an AI code review tool actually useful for your workflow? What would make you ignore it?
Let's see where this goes.