AI-Generated Code and Fake Senior Laravel Code

AI Did Not Replace Junior Developers. It Created Fake Senior Code.

AI-generated backend code is not hard to reject when it is obviously broken. The harder case is code that looks organized and still skips production rules.

This study uses a small Laravel wallet-transfer project. One user sends credits to another user. The same feature is implemented twice: once as code that looks clean but misses key boundaries, and once as production-aware code with validation, authorization, transactions, row locks, failure records, logs, and tests.

The study command runs 16 scenarios against both services. The fake service passed 9 and failed 7, scoring 51.7 out of 100. The real service passed all 16 and scored 100 out of 100.

The Claim

The problem is not that AI can generate bad code. Developers wrote bad code before AI. The current risk is different: generated code can look complete before it has handled the rules that decide whether it is safe to ship.

The fake implementation in this project is not nonsense. It has a service class, readable method names, a shared result object, a numeric amount check, a negative amount check, and a successful transaction record. On the happy path, it works.

That is still not enough for wallet logic. A transfer service must check who is spending, reject invalid amounts, prevent self-transfer, keep balances consistent, record failed attempts, and update both wallets inside one database transaction.

External Context

The evidence around AI coding tools is mixed. The GitHub Copilot productivity study reported faster completion on a bounded JavaScript HTTP-server task. A later METR study on experienced open-source developers found slower completion when AI tools were allowed in familiar repositories.

Stack Overflow's 2025 Developer Survey on AI reported that more developers distrusted the accuracy of AI output than trusted it. The DORA 2025 State of AI-assisted Software Development report also points to the same practical issue: tools do not replace the engineering system around them.

This project stays smaller. It does not ask whether AI is good or bad in general. It asks whether one Laravel service survives 16 explicit backend checks.

Project Shape

The project is called fake-senior-code-laravel-wallet. It is a runnable Laravel app with migrations, models, services, a policy, tests, a scoring helper, and an Artisan command.

The domain uses Laravel's default users table and two wallet tables:

wallets: one wallet per user, with a decimal balance.
wallet_transactions: one row for each successful or failed transfer attempt.

Both services expose the same method:

public function transfer(
    User $actor,
    int $senderId,
    int $receiverId,
    mixed $amount
): TransferResult;

That keeps the comparison narrow. Same input. Same expected behavior. Different implementation quality.
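The article does not show TransferResult itself. As a minimal sketch of what a shared result object could look like, assuming it carries a success flag, an optional failure reason, and the normalized amount (the class name, fields, and factory methods here are hypothetical):

```php
<?php
declare(strict_types=1);

// Hypothetical shape for the shared result object. The project's real
// TransferResult may carry different fields.
final class TransferResultSketch
{
    private function __construct(
        public readonly bool $success,
        public readonly ?string $failureReason = null,
        public readonly ?string $amount = null,
    ) {
    }

    // Successful transfer: no failure reason, amount as a decimal string.
    public static function ok(string $amount): self
    {
        return new self(true, null, $amount);
    }

    // Rejected transfer: a machine-readable reason and no amount.
    public static function failed(string $reason): self
    {
        return new self(false, $reason);
    }
}

$accepted = TransferResultSketch::ok('25.00');
$rejected = TransferResultSketch::failed('self_transfer');
```

Returning one immutable type from both services is what lets the same scenario set inspect either implementation without special cases.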

Wallet Contract

The benchmark does not model a full payment system. It checks the minimum backend rules a wallet transfer should obey.

Item	Rule
`actor`	Must be allowed to spend from the sender wallet.
`sender_id`	Must identify an existing wallet with enough balance.
`receiver_id`	Must identify an existing wallet and must not equal the sender.
`amount`	Must be numeric, greater than zero, and have at most two decimal places.
Balance update	Debit and credit must happen inside one database transaction.
Wallet rows	Rows should be locked before balance calculation.
Failure path	Failed attempts should be recorded with a reason.
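The amount rule in the contract can be sketched in plain PHP. This is an illustration under stated assumptions, not the project's normalization code: the function name is hypothetical, and it assumes amounts arrive as an int, a float, or a numeric string.

```php
<?php
declare(strict_types=1);

/**
 * Returns the amount as a two-decimal string, or null when it breaks the
 * contract (not numeric, not positive, or more than two decimal places).
 * Hypothetical helper, not taken from the project.
 */
function normalizeAmountSketch(mixed $amount): ?string
{
    if (is_int($amount) || is_float($amount)) {
        $amount = (string) $amount;
    }

    // Digits with an optional fraction of one or two digits. Signs,
    // exponents, whitespace, and a third decimal place are all rejected.
    if (!is_string($amount) || preg_match('/^(\d+)(?:\.(\d{1,2}))?\z/', $amount, $m) !== 1) {
        return null;
    }

    $whole = ltrim($m[1], '0');
    if ($whole === '') {
        $whole = '0';
    }
    $fraction = str_pad($m[2] ?? '', 2, '0');
    $normalized = $whole . '.' . $fraction;

    // A numeric check alone would let 0.00 through; the contract requires > 0.
    return $normalized === '0.00' ? null : $normalized;
}
```

Working on the string form avoids float rounding on the way in. A negative-only check, like the one the fake service is described as using, would accept `'0.00'`, which this sketch rejects.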

The Two Services

FakeSeniorTransferService looks reasonable at first glance. It checks whether the amount is numeric. It rejects negative amounts. It loads sender and receiver wallets. It checks available balance. It updates balances. It records a successful transaction.

The missing parts are the issue:

No authorization check. An actor can spend from another user's wallet.
0.00 is treated as a successful transfer.
Self-transfer is allowed and can corrupt the balance.
No DB::transaction.
No lockForUpdate.
Most failed attempts are not written to wallet_transactions.
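The article does not show how self-transfer corrupts the balance. One plausible mechanism, assumed here for illustration, is that the sender and receiver are read as two independent snapshots and then written back separately. The sketch uses an in-memory array in place of the wallets table and integer cents in place of decimals:

```php
<?php
declare(strict_types=1);

// In-memory stand-in for the wallets table: wallet id => balance in cents.
// Hypothetical illustration of the failure mode, not the project's code.
function naiveTransferSketch(array &$wallets, int $senderId, int $receiverId, int $cents): void
{
    // Two independent reads, as if two model instances had been loaded.
    $senderBalance = $wallets[$senderId];
    $receiverBalance = $wallets[$receiverId];

    // Two independent writes, each computed from its own snapshot.
    $wallets[$senderId] = $senderBalance - $cents;
    $wallets[$receiverId] = $receiverBalance + $cents;
}

$wallets = [1 => 10000, 2 => 5000];
naiveTransferSketch($wallets, 1, 1, 2500);
// Wallet 1 now holds 12500 cents: the credit overwrote the debit.
```

When both ids point at the same row, the second write wins and the wallet gains money it never received. Rejecting `$senderId === $receiverId` before any read removes the case entirely.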

RealSeniorTransferService handles the same feature with explicit checks: amount normalization, self-transfer rejection, actor ownership, wallet policy, wallet existence checks, transaction wrapper, row locking, failed transaction rows, and warning logs.
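The ordering of those checks can be sketched without the framework. In Laravel the wrapper would be `DB::transaction` and the locked reads would use `lockForUpdate`; the version below uses plain PDO with in-memory SQLite so it runs standalone. The function name, columns, integer cents, reason codes, and the simplified "actor must be the sender" rule are all assumptions, not the project's code.

```php
<?php
declare(strict_types=1);

// Standalone sketch: needs the pdo_sqlite extension. Table names follow the
// article; everything else is an assumption made for illustration.
function guardedTransferSketch(PDO $pdo, int $actorId, int $senderId, int $receiverId, int $cents): string
{
    $balanceOf = static function (int $userId) use ($pdo): int|false {
        // On MySQL or PostgreSQL this SELECT would end with FOR UPDATE to lock the row.
        $stmt = $pdo->prepare('SELECT balance FROM wallets WHERE user_id = ?');
        $stmt->execute([$userId]);
        $value = $stmt->fetchColumn();
        return $value === false ? false : (int) $value;
    };

    $pdo->beginTransaction();
    try {
        $reason = null;
        if ($cents <= 0) {
            $reason = 'invalid_amount';
        } elseif ($senderId === $receiverId) {
            $reason = 'self_transfer';
        } elseif ($actorId !== $senderId) {
            $reason = 'unauthorized'; // simplified stand-in for a wallet policy
        } else {
            $senderBalance = $balanceOf($senderId);
            if ($senderBalance === false || $balanceOf($receiverId) === false) {
                $reason = 'wallet_not_found';
            } elseif ($senderBalance < $cents) {
                $reason = 'insufficient_balance';
            } else {
                $move = $pdo->prepare('UPDATE wallets SET balance = balance + ? WHERE user_id = ?');
                $move->execute([-$cents, $senderId]);
                $move->execute([$cents, $receiverId]);
            }
        }
        // Rejections are committed too, so every attempt leaves a history row.
        $log = $pdo->prepare(
            'INSERT INTO wallet_transactions (sender_id, receiver_id, amount, status, reason)
             VALUES (?, ?, ?, ?, ?)'
        );
        $log->execute([$senderId, $receiverId, $cents, $reason === null ? 'success' : 'failed', $reason]);
        $pdo->commit();
        return $reason ?? 'ok';
    } catch (Throwable $e) {
        $pdo->rollBack(); // debit, credit, and history row are undone together
        throw $e;
    }
}

$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE wallets (user_id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)');
$pdo->exec('CREATE TABLE wallet_transactions (id INTEGER PRIMARY KEY, sender_id INTEGER,
            receiver_id INTEGER, amount INTEGER, status TEXT, reason TEXT)');
$pdo->exec('INSERT INTO wallets VALUES (1, 10000), (2, 5000)');

$first = guardedTransferSketch($pdo, 1, 1, 2, 2500);  // actor owns the sender wallet
$second = guardedTransferSketch($pdo, 2, 1, 2, 100);  // actor 2 tries to spend wallet 1
```

The cheap rejections run before any wallet row is read, and the history insert sits on the shared path so success and failure are recorded the same way.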

Scenario Protocol

The command php artisan wallet:fake-senior-study creates fresh users and wallets for each case. It runs both services through the same scenario set and scores the result.

The measured scenario count is:

S = 16

The core scenarios are:

successful transfer
insufficient balance
transfer to self
zero amount
negative amount
missing sender wallet
missing receiver wallet
unauthorized transfer attempt
repeated transfer attempt
transaction history consistency
failure reason logging
balance unchanged after failed transfer

The remaining checks cover happy-path coverage, failure-path coverage, structured result object, and stable service API.
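The idea of running one scenario set against two implementations can be sketched as a small harness. This is not the Artisan command's code: the function name is hypothetical, and the two toy "services" are stateless closures, whereas the real command creates fresh users and wallets for each case.

```php
<?php
declare(strict_types=1);

/**
 * Runs each named check against a service and tallies the outcome.
 * $scenarios maps a scenario name to a callable that receives the service
 * and returns true when the expected behavior was observed.
 */
function runScenariosSketch(callable $service, array $scenarios): array
{
    $report = ['passed' => [], 'failed' => []];
    foreach ($scenarios as $name => $check) {
        try {
            $ok = $check($service) === true;
        } catch (Throwable) {
            $ok = false; // an unexpected exception counts as a failed scenario
        }
        $report[$ok ? 'passed' : 'failed'][] = $name;
    }
    return $report;
}

// Toy services: each returns true when it accepts a transfer of $cents.
$lenient = fn (int $cents): bool => $cents >= 0; // lets zero through
$strict = fn (int $cents): bool => $cents > 0;

$scenarios = [
    'successful transfer' => fn (callable $s): bool => $s(100) === true,
    'zero amount' => fn (callable $s): bool => $s(0) === false,
    'negative amount' => fn (callable $s): bool => $s(-100) === false,
];

$lenientReport = runScenariosSketch($lenient, $scenarios);
$strictReport = runScenariosSketch($strict, $scenarios);
```

Because the scenarios only see a callable, neither implementation can be favored by the harness: the same inputs and the same expectations apply to both.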

Scoring

The Production Readiness Score uses a weighted checklist:

PRS =
0.20 Validation
+ 0.20 Authorization
+ 0.20 Database Consistency
+ 0.15 Failure Handling
+ 0.15 Test Coverage
+ 0.10 Maintainability

Each category score is based on passed checks:

Category Score = (Passed Checks / Total Checks) × 100
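The weighted sum can be written out as a short function. The weights come from the formula above. The per-category counts in the example are an assumption: the article does not publish the full mapping of scenarios to categories, so the counts below are one assignment that is consistent with the reported totals (9 passed, 7 failed), not the study's own breakdown.

```php
<?php
declare(strict_types=1);

// Weights as listed in the PRS formula.
const PRS_WEIGHTS = [
    'validation' => 0.20,
    'authorization' => 0.20,
    'database_consistency' => 0.20,
    'failure_handling' => 0.15,
    'test_coverage' => 0.15,
    'maintainability' => 0.10,
];

/**
 * @param array<string, array{0: int, 1: int}> $checks category => [passed, total]
 */
function productionReadinessScoreSketch(array $checks): float
{
    $score = 0.0;
    foreach (PRS_WEIGHTS as $category => $weight) {
        [$passed, $total] = $checks[$category];
        $score += $weight * ($passed / $total * 100); // weight x category score
    }
    return round($score, 1);
}

// Hypothetical per-category counts (assumed, not published in the article).
$assumedFakeCounts = [
    'validation' => [1, 3],
    'authorization' => [0, 1],
    'database_consistency' => [4, 4],
    'failure_handling' => [0, 4],
    'test_coverage' => [2, 2],
    'maintainability' => [2, 2],
];

$assumedFakeScore = productionReadinessScoreSketch($assumedFakeCounts); // 51.7
```

Under these assumed counts the three fully passed categories contribute 20 + 15 + 10 = 45 points, and one of three validation checks adds about 6.7 more. That shows how a service can clear the halfway mark while scoring zero on authorization.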

Measured Results

The project was run locally with:

php artisan test
php artisan wallet:fake-senior-study --json

The test suite passed: 20 tests and 72 assertions.

Service	Passed	Failed	Score
Fake Senior Code	9	7	51.7 / 100
Real Senior Code	16	0	100 / 100

The core scenario results:

Scenario	Fake	Real
Happy path result	PASS	PASS
Transfer to self	FAIL	PASS
Zero amount	FAIL	PASS
Negative amount	PASS	PASS
Insufficient balance	PASS	PASS
Missing sender wallet	FAIL	PASS
Missing receiver wallet	FAIL	PASS
Unauthorized transfer attempt	FAIL	PASS
Repeated transfer attempt	PASS	PASS
Transaction history consistency	FAIL	PASS
Failure reason logging	FAIL	PASS
Balance unchanged on failed transfer	PASS	PASS

Why Some Scores Match

The fake service scored 100% in the measured database-consistency category. That does not make it safe under production traffic.

That category only checked sequential cases: balances after success, insufficient balance, repeated transfer, and failed transfer. The fake service passed those checks. It still lacks DB::transaction and lockForUpdate. This benchmark does not run parallel workers against the same sender wallet.

A stronger version should split that category into two checks:

sequential consistency
concurrency safety
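Why sequential checks cannot stand in for concurrency safety can be shown with a deterministic replay. The sketch below does not spawn real workers. It hard-codes one interleaving in which both workers read the balance before either writes, which is the overlap a row lock is meant to prevent. The function name and the amounts are hypothetical.

```php
<?php
declare(strict_types=1);

// Deterministic replay of two overlapping debits against one wallet row.
// Each "worker" reads the balance first and writes later, with no lock between.
function replayUnlockedDebitsSketch(int $startCents, int $debitA, int $debitB): int
{
    $stored = $startCents;

    $readByA = $stored; // worker A reads
    $readByB = $stored; // worker B reads before A has written

    if ($readByA >= $debitA) {
        $stored = $readByA - $debitA; // A passes its balance check and writes
    }
    if ($readByB >= $debitB) {
        $stored = $readByB - $debitB; // B passes on a stale read and overwrites
    }

    return $stored;
}

$final = replayUnlockedDebitsSketch(10000, 8000, 8000);
// Both debits were approved against a 10000-cent wallet; the stored balance is 2000.
```

Run one after the other, the second debit would be rejected for insufficient balance. A sequential consistency check therefore passes for code that can still approve 16000 cents of spending from a 10000-cent wallet under load.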

Test coverage and maintainability also match because both services expose the same method and return TransferResult. That is the point. Fake senior code can have structure and still miss the rules that matter.

Reading the Numbers

The fake service passed 9 of 16 scenarios. The passes are real. The happy path works. Negative amounts are rejected. Basic insufficient-balance behavior leaves balances unchanged.

The failures matter more for shipping. Unauthorized transfers succeed. Zero-value transfers are accepted. Self-transfer can corrupt the wallet balance. Missing-wallet failures are not recorded. Failure reasons are not stored. The update path has no explicit transaction or row lock.

The real service passed all 16 scenarios. That does not make it a complete payment system. It means it satisfies the contract defined in this project.

Practical Notes

The benchmark uses SQLite, which locks the whole database file rather than individual rows. Check locking behavior and performance again on MySQL or PostgreSQL before using this as a production pattern.

The benchmark does not include a true concurrent load test. It verifies that the real service has transaction and row-lock boundaries, and it checks repeated transfer behavior sequentially. A next version should run parallel requests against the same wallet.

A real wallet system should also add idempotency keys, immutable ledger rows, currency handling, retry behavior, reconciliation, and stronger audit rules.

Conclusion

The result is simple: the fake service passed 9 of 16 scenarios and scored 51.7. The real service passed 16 of 16 and scored 100.

The fake service is useful because it is not bad everywhere. It has structure. It passes simple checks. It fails when the test asks about ownership, invalid amounts, failure history, and database boundaries.

For AI-assisted backend work, do not stop at a service class and a success response. Define the production rules and test them. In this Laravel project, 16 scenarios were enough to show the difference.

For anyone who wants to dive deeper, I’ve attached the full study as a PDF here with all its details.
