Test if AI can change your accounting
Many small and medium-sized retailers across Australia operate without the benefit of a dedicated accountant. It is an everyday reality, with most relying on periodic check-ins with external advisors rather than having in-house financial expertise. As a result, regular strategic analysis of financial reports can fall by the wayside, leaving business owners at a disadvantage when making timely, data-driven decisions. With the rapid advancement of AI financial analysis tools, a pressing question has emerged: Can artificial intelligence reliably analyse SMB financial reports and deliver the kind of actionable business insights that drive retail success?
This question came up in a recent meeting with several clients, many of whom look to us as leaders in AI technology for retail in our market: how good is AI at analysing financial reports for SMB retailers? Unlike large organisations with in-house finance teams, most SMBs only catch up with their accountant once a year. Many would like an accountant to review their figures more often and give some advice, but time and cost are real constraints.
To answer this question, we undertook a practical experiment to reflect Australian retailers' real-world challenges. The aim was to determine whether today's leading AI accounting tools could step in where regular financial analysis is often lacking, especially for businesses using platforms like MYOB or QuickBooks.
Drawing on real-world testing and current best practice in retail financial management, we asked: could AI deliver the financial insight an SMB retailer needs without an accountant?
Our Test Methodology
Here's how we structured the test to ensure it was fair, practical, and relevant to everyday retail operations:
Reports Provided:
We used real-world reports exported from MYOB or QuickBooks, containing the errors and inconsistencies you'd expect from an SMB without a trained bookkeeper. The data covered two years, and we provided:
Profit and Loss Statement
Balance Sheet
Trial Balance
We then combined them into a single PDF.
Our prompt
After refining our approach through multiple iterations, we developed a comprehensive prompt for the AI platforms.
==========================================================================================================
Company Performance Assessment Prompt
"I need you to provide a comprehensive business performance assessment for [Company Name] based on their financial statements for [Year 1] and [Year 2]. Your role is to help the company understand how they're performing and what they should focus on going forward.
DOCUMENTS PROVIDED IN PDF format:
- Profit & Loss Statement (2 years)
- Balance Sheet (2 years)
- Trial Balance (2 years)
Note: These documents may contain some errors or inconsistencies. Please flag any issues you notice, but focus primarily on providing valuable business insights about the company's performance and direction.
YOUR ASSESSMENT SHOULD COVER:
Overall Business Health
- How is the company performing financially?
- What are the key strengths and areas of concern?
- Is the business trending in the right direction?
Key Performance Highlights
- Revenue growth and sustainability
- Profitability trends and margins
- Cash flow and liquidity position
- Operational efficiency improvements or declines
Areas Requiring Attention
- What challenges is the company facing?
- Which metrics or trends are concerning?
- What risks should management be aware of?
Strategic Recommendations
- What should the company focus on to improve performance?
- What opportunities exist for growth or efficiency gains?
- What immediate actions should management consider?
Future Outlook
- Based on current trends, where is the company headed?
- What should they monitor closely in the future?
- What are the key success factors for continued growth?
DELIVERABLE: Write this as a business performance report that decision-makers in an SMB retail shop would find valuable for strategic planning and decision-making. Focus on practical insights that help them understand their business better and make informed decisions about the future.
If you notice any errors in the financial data, please mention them, but don't let data quality issues prevent you from providing meaningful business insights."
============================================================================================================================================
The test
The Free AI Landscape
Not all AI tools are equal: some are better than others, and each free account has its limits. So we tested six popular tools to the maximum the free version allowed.
Evaluation Criteria for AI Tool Performance
All of these tools are capable, but even among the six best runners in the world, one is faster than the rest. That is what we wanted to find out: which free AI is best for retailers.
For each tool, we ran the report on the best model available on the free tier.
Scoring the AI Performance
To objectively compare the AI-generated reports, we implemented a 100-point scoring system based on:
Overall Business Health: Did the AI assess the company's financial position accurately?
Trend Analysis: How well did it identify the direction of the company?
Key Performance Insights: Did it break down revenue, profitability, cash flow, and operational efficiency?
Problem and Risk Assessment: Could it pinpoint areas requiring attention and highlight potential risks?
Actionable Recommendations: Were the suggestions practical, relevant, and implementable?
Communication and Usability: Was the report easy to understand and follow?
Error Handling: Did the AI flag data inconsistencies and avoid being misled by errors?
Scoring Guidelines:
We graded the final scores as follows:
90–100: Exceptional Business Insight
80–89: Strong Business Analysis
70–79: Adequate Assessment
60–69: Poor Business Assessment
Below 60: Fail
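The grading bands above can be expressed as a small lookup. This is only a sketch in Python: the function name `grade_band` is our own invention for illustration, and it assumes whole-number scores between 0 and 100.

```python
def grade_band(score: int) -> str:
    """Map a score out of 100 to the grading band listed above."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 90:
        return "Exceptional Business Insight"
    if score >= 80:
        return "Strong Business Analysis"
    if score >= 70:
        return "Adequate Assessment"
    if score >= 60:
        return "Poor Business Assessment"
    return "Fail"


if __name__ == "__main__":
    # Sample scores chosen only to exercise three different bands.
    for sample in (88, 78, 59):
        print(sample, grade_band(sample))
```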
Detailed results by AI
Claude
Score: 88/100 — Strong Business Analysis
Overall Business Health (23/25):
- Gave a clear, accurate summary: highlighted a 10.2% revenue decline and a 30.4% net income drop.
- Balanced strengths (a strong balance sheet) with concerns (the company's shrinking revenue).
Trend Analysis (9/10):
- Nailed the year-over-year trend, contextualising declines and highlighting the need for urgent action.
Key Performance Insights (27/30):
- Broke down revenue by segment
- Noted labour revenue decline (-15%)
- Analysed gross margin, which was up, and net margin, which was down
- Flagged cash and inventory issues
Problem & Risk Assessment (18/20):
- Prioritised risks: liquidity constraints, inventory overstock, and bad debts
- Could have expanded on market/competitive risks.
Strategic Recommendations (17/20):
- Practical advice: weekly cash flow monitoring, inventory optimisation, service diversification.
- It suggested digital transformation and market expansion, but it should have gone deeper into a long-term strategy.
Communication & Usability (3/5):
- Well-structured, logical, but dense—could be more concise for busy owners.
Error Handling:
- +3 bonus for flagging data quality issues without losing focus.
Summary:
Claude delivered a highly detailed, data-driven report with actionable advice, only let down by its length and slightly dense style.
Google
Score: 85/100 — Strong Business Analysis
Overall Business Health (22/25):
- Clear on declining health
- Balanced positives (profitability) with concerns (liquidity, inventory).
Trend Analysis (9/10):
- Strong in historical context, less predictive about future trends.
Key Performance Insights (26/30):
- Detailed on revenue declines and margin shifts.
- Flagged inventory days (752 days in FY24) as a significant concern.
- Some working capital ratios were missed due to data presentation.
Problem & Risk Assessment (17/20):
- Identified key risks: revenue drop, inventory build-up, and bad debts.
- Could have ranked risks more sharply.
Strategic Recommendations (16/20):
- Practical calls for inventory reduction and credit control.
- Some recommendations (like scenario planning) were less actionable.
Communication & Usability (4/5):
- Clear, structured, and accessible.
Error Handling:
- +3 bonus for handling data inconsistencies without losing the thread.
Summary:
Google's report was well-organised, with strong practical recommendations and a solid structure, though its strategic vision could have been more forward-looking and detailed.
Grok
Score: 82/100 — Strong Business Analysis
Overall Business Health (21/25):
- Accurate on decline (revenue -10.2%, net income -30.4%), but less specific in some areas.
Trend Analysis (9/10):
- The downward trend was correctly identified, but the future outlook was generic.
Key Performance Insights (25/30):
- Covered revenue and margin declines, but not as deep on drivers.
- Noted liquidity and inventory issues, but gave little information about working capital, which is essential in SMB retailing.
Problem & Risk Assessment (16/20):
- Flagged revenue and liquidity risks, but prioritisation was vague.
Strategic Recommendations (16/20):
- Practical advice on cash flow and inventory, but lacked specificity.
Communication & Usability (4/5):
- Well-structured and clear.
Error Handling:
- +2 bonus for addressing data quality without overemphasis.
Summary:
Grok produced a solid, readable report with reasonable recommendations, though it lacked the detail and strategic sharpness seen in others.
DeepSeek
Score: 80/100 — Strong Business Analysis
Overall Business Health (20/25):
- Clear on decline (revenue -10.2%, net income -30.4%), noted equity growth (+10.9%).
Trend Analysis (8/10):
- A downward trend was identified, but the future outlook was general.
Key Performance Insights (24/30):
- Evaluated revenue decline, solid on margins, but less on specific drivers and operational ratios.
Problem & Risk Assessment (16/20):
- Flagged revenue and liquidity risks, but didn't rank them by impact.
Strategic Recommendations (16/20):
- Practical but broad suggestions.
Communication & Usability (4/5):
- Well-organised and clear.
Error Handling:
- +2 bonus for noting data inconsistencies without overemphasis.
Summary:
DeepSeek gave a firm overview and reasonable recommendations but lacked specificity and depth.
Qwen
Score: 75/100 — Adequate Assessment
Overall Business Health (18/25):
- Provided a basic overview but made a significant data error by misstating one department's growth.
Trend Analysis (8/10):
- A downward trend was noted, but the analysis was surface-level.
Key Performance Insights (22/30):
- Missed key revenue drivers, and the operational analysis was superficial.
Problem & Risk Assessment (15/20):
- Identified issues but lacked nuance.
Strategic Recommendations (15/20):
- Practical, but generic and not innovative.
Communication & Usability (5/5):
- Clear and concise.
Error Handling:
- -5 penalty for data misinterpretation.
Summary:
Qwen's report was easy to read but was undermined by errors and a lack of depth.
ChatGPT
Score: 78/100 — Adequate Assessment
Overall Business Health (19/25):
- Summarised the decline, but used rounded numbers, so it lacked precision.
Trend Analysis (8/10):
- A downward trend was noted, but the analysis was basic.
Key Performance Insights (23/30):
- Reasonable on revenue and margins, but lacked detail and segment analysis.
Problem & Risk Assessment (16/20):
- Flagged liquidity and inventory risks but didn't assess the impact.
Strategic Recommendations (16/20):
- Practical, but generic.
Communication & Usability (4/5):
- Clear but slightly verbose.
Error Handling:
- No bonus or penalty.
Summary:
ChatGPT produced a basic, readable report with reasonable recommendations but lacked the depth and precision needed.
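A note on the arithmetic: the category maximums in the breakdowns above (25, 10, 30, 20, 20 and 5) add up to 110, not 100. The headline scores appear to match the category subtotal rescaled to 100 and rounded down, with the error-handling bonus or penalty left out. That is a reading of the published numbers rather than a documented method; the Python sketch below simply checks it, and the names in it are ours.

```python
# Category points copied from the breakdowns above, in this order:
# health, trend, insights, risk, recommendations, communication.
CATEGORY_POINTS = {
    "Claude": [23, 9, 27, 18, 17, 3],
    "Google": [22, 9, 26, 17, 16, 4],
    "Grok": [21, 9, 25, 16, 16, 4],
    "DeepSeek": [20, 8, 24, 16, 16, 4],
    "Qwen": [18, 8, 22, 15, 15, 5],
    "ChatGPT": [19, 8, 23, 16, 16, 4],
}

# Headline scores as published in each tool's "Score:" line.
PUBLISHED = {
    "Claude": 88, "Google": 85, "Grok": 82,
    "DeepSeek": 80, "Qwen": 75, "ChatGPT": 78,
}

MAX_POINTS = 25 + 10 + 30 + 20 + 20 + 5  # category maximums, totalling 110


def rescaled(points: list[int]) -> int:
    """Rescale a category subtotal to a 0-100 range, rounding down.

    Assumption: this rule is inferred from the numbers, not stated in the test.
    """
    return sum(points) * 100 // MAX_POINTS


if __name__ == "__main__":
    for tool, points in CATEGORY_POINTS.items():
        print(f"{tool}: subtotal {sum(points)}/{MAX_POINTS}, "
              f"rescaled {rescaled(points)}, published {PUBLISHED[tool]}")
```

Under that assumption, all six rescaled subtotals line up with the published scores.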
What This Means for SMB Retailers
The findings are encouraging. AI financial analysis tools have reached a level that can genuinely support SMB retailers in understanding their business performance. Even when presented with imperfect data, the best AI platforms can identify significant risks and opportunities and offer valuable insights.
The top performers, Claude and Google, delivered strong, actionable business analysis. These platforms identified critical risks, such as declining revenue and tightening cash flow, and offered practical recommendations that retailers could implement immediately.
Claude's report was the best. It provided clear summaries and highlighted both strengths and concerns. Its recommendations focused on weekly cash flow monitoring, inventory optimisation, and service diversification.
Google's analysis was well-structured and accessible. It balanced a clear-eyed view of business health with practical calls for inventory reduction and tighter credit control. While its strategic outlook could be more forward-looking, it handled inconsistencies in the data with confidence.
The other four tools, Grok, DeepSeek, ChatGPT, and Qwen, scored between 75 and 82, and all of them could do the job.
My Expert Commentary
As someone with deep experience in retail and POS systems, I believe these developments are a genuine opportunity for SMB retailers. AI tools do not substitute for expert advice, and not one AI produced what we would call an exceptional report. However, they significantly assist those lacking regular access to accountants. The capacity to conduct a thorough financial analysis in minutes, using data exported directly from accounting software, can revolutionise strategic planning and daily operations.
This AI-driven financial analysis empowers retailers to identify emerging trends before they escalate into critical issues, make informed decisions regarding pricing, inventory, and supplier management, identify areas for improving operational efficiency, and concentrate on the most essential metrics: cash flow, profit margins, and stock turnover.
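For readers who want to sanity-check an AI report against their own figures, the core metrics are simple to compute. The sketch below is in Python and uses made-up numbers, not the test company's data. It assumes the common textbook definitions: year-over-year change as a percentage of the prior year, margin as profit over revenue, and inventory days as closing inventory over annual cost of goods sold, times 365. Your accountant may prefer average inventory instead.

```python
def pct_change(previous: float, current: float) -> float:
    """Year-over-year change as a percentage of the previous year."""
    return (current - previous) * 100 / previous


def margin_pct(profit: float, revenue: float) -> float:
    """Profit as a percentage of revenue (gross or net, depending on input)."""
    return profit * 100 / revenue


def inventory_days(closing_inventory: float, annual_cogs: float) -> float:
    """Approximate days of stock on hand, using closing inventory."""
    return closing_inventory / annual_cogs * 365


if __name__ == "__main__":
    # Hypothetical figures for illustration only.
    revenue_last_year, revenue_this_year = 1_000_000, 898_000
    net_profit_this_year = 44_900
    closing_stock, cogs = 150_000, 600_000

    print(f"Revenue change: {pct_change(revenue_last_year, revenue_this_year):.1f}%")
    print(f"Net margin: {margin_pct(net_profit_this_year, revenue_this_year):.1f}%")
    print(f"Inventory days: {inventory_days(closing_stock, cogs):.0f}")
```

If the AI's percentages differ noticeably from a check like this, treat that as a prompt to look at the underlying data before acting on the recommendation.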
My Professional Recommendation
I strongly recommend that every SMB retailer in Australia conduct a serious financial analysis at least once a quarter using these AI tools. Here's how you can do it:
- Export your financials (P&L, Balance Sheet, Trial Balance) from your accounting software.
- Feed them into a top-rated AI; I suggest Claude or Google with the prompt above.
- Review the AI's report for trends, risks, and recommendations.
- Act on the insights.
- Repeat this analysis at least quarterly. Regular reviews can let you adapt quickly to changing business conditions.
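To make the quarterly routine repeatable, the bracketed placeholders in the prompt can be filled in by a short script before you paste it into the AI tool. This is a sketch in Python: the company name and years are invented, the function name `fill_prompt` is ours, and only the opening sentence of the prompt is reproduced here.

```python
# Opening sentence of the prompt above, with its original placeholders.
PROMPT_OPENING = (
    "I need you to provide a comprehensive business performance assessment "
    "for [Company Name] based on their financial statements for [Year 1] "
    "and [Year 2]."
)


def fill_prompt(template: str, company: str, year_1: str, year_2: str) -> str:
    """Replace the three bracketed placeholders with real values."""
    replacements = {
        "[Company Name]": company,
        "[Year 1]": year_1,
        "[Year 2]": year_2,
    }
    for placeholder, value in replacements.items():
        template = template.replace(placeholder, value)
    return template


if __name__ == "__main__":
    # Hypothetical business and years, for illustration only.
    print(fill_prompt(PROMPT_OPENING, "Example Newsagency", "FY23", "FY24"))
```

Keeping the prompt wording identical from quarter to quarter makes the reports easier to compare over time.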
If you need help setting this up, reach out for a chat. If there is enough interest, I will do a webinar.
Written by:
Bernard Zimmermann is the founding director at POS Solutions, a leading point-of-sale system company with 45 years of industry experience. He consults to various organisations, from small businesses to large retailers and government institutions. Bernard is passionate about helping companies optimise their operations through innovative POS technology and enabling seamless customer experiences through effective software solutions.