Python Automation: +35% Sales Efficiency in 7 Months

Case Study
• 10 min read • October 2024
Python
Automation
ERP
Process Optimization

Executive Summary

During my 7-month internship as a Database Manager & Data Analyst at Color Foods (April-October 2021), I developed Python automation scripts that improved sales department efficiency by 35%, saving 6 hours per week for 2 business users. This article breaks down the methodology, technical implementation, and ROI calculation for data process automation in a mid-sized food distribution company.

Key Results:

  • 35% efficiency improvement in sales operations
  • 6 hours/week saved across the 2 business users (312 hours/year total)
  • €12,480 annual cost savings
  • 100% adoption rate across sales team

Context: The Manual Data Nightmare

Company Profile

Color Foods is a Marseille-based food distribution company with:

  • €50M annual revenue
  • 120 employees
  • 2,500+ SKUs (product catalog)
  • 350+ B2B clients
  • ERP: Sage X3 (French market leader)

The Problem

The sales team spent 3 hours daily on repetitive data tasks:

  1. Customer Order Analysis (1.5h/day)

    • Export orders from Sage X3 to Excel
    • Clean duplicate entries
    • Calculate order patterns (frequency, average basket size)
    • Identify dormant customers
  2. Weekly Sales Reporting (1h/day)

    • Pull revenue data by product category
    • Compare week-over-week growth
    • Highlight top/bottom performers
    • Format for management presentation
  3. Inventory Reconciliation (0.5h/day)

    • Cross-reference ERP stock levels with warehouse counts
    • Flag discrepancies for investigation
    • Generate reorder alerts

Total Time Waste:

  • 3 hours/day × 2 users × 5 days/week = 30 hours/week
  • Annualized: 1,560 hours/year
  • At €40/hour: €62,400/year in labor cost

Pain Points:

  • Error-prone manual data entry
  • Inconsistent formatting across reports
  • No audit trail for decisions
  • Sales team frustrated doing "data grunt work" instead of selling

Solution Architecture

Tech Stack

# Core dependencies
pandas==1.3.5          # Data manipulation
sqlalchemy==1.4.27     # Database connection
openpyxl==3.0.9        # Excel automation
python-dotenv==0.19.2  # Configuration
schedule==1.1.0        # Task scheduling
smtplib (stdlib)       # Email automation

Why Python?

  • Already installed on company machines (vs. licensing Power BI/Alteryx)
  • IT team familiar with Python for other scripts
  • Rich ecosystem for ERP integration

Why not Power BI?

  • Sales team needed raw Excel outputs (not dashboards)
  • Budget constraints (Power BI Pro licenses: €8.40/user/month)
  • Python offered more flexibility for custom workflows

System Architecture

┌─────────────────┐
│   Sage X3 ERP   │
│   (SQL Server)  │
└────────┬────────┘
         │ SQL Queries
         ↓
┌─────────────────┐
│  Python Scripts │
│  - extract.py   │
│  - transform.py │
│  - report.py    │
└────────┬────────┘
         │ Pandas DataFrames
         ↓
┌─────────────────┐
│  Output Files   │
│  - Excel        │
│  - CSV          │
│  - Email        │
└─────────────────┘

Implementation: Three Automation Scripts

Script 1: Customer Order Analyzer

Before (Manual Process):

  1. Open Sage X3
  2. Navigate to Orders module
  3. Export last 30 days to CSV (5 clicks)
  4. Open Excel, clean data manually (20 min)
  5. Create pivot tables (15 min)
  6. Identify dormant customers (10 min)
  7. Email sales team (5 min)

Total: 50 minutes daily

After (Automated):

import os
import smtplib
import pandas as pd
from sqlalchemy import create_engine
from datetime import datetime
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.mime.base import MIMEBase
from email import encoders
from openpyxl.styles import PatternFill
from dotenv import load_dotenv

load_dotenv()  # load DB and SMTP credentials from .env

# 1. Connect to Sage X3 database (credentials from .env, never hardcoded)
engine = create_engine(
    f"mssql+pyodbc://{os.getenv('DB_USER')}:{os.getenv('DB_PASSWORD')}@server/SAGE_X3"
    "?driver=ODBC+Driver+17+for+SQL+Server"
)

# 2. Extract orders from last 30 days
query = """
SELECT
    BPCORD_0 AS customer_id,
    BPCNAM_0 AS customer_name,
    ORDDAT_0 AS order_date,
    INVNOT_0 AS order_amount,
    INVNOT_0 / NULLIF(QTY_0, 0) AS avg_unit_price
FROM SORDER
WHERE ORDDAT_0 >= DATEADD(day, -30, GETDATE())
    AND ORDSTA_0 IN (2, 3)  -- Confirmed orders
ORDER BY ORDDAT_0 DESC
"""

df = pd.read_sql(query, engine)

# 3. Calculate customer metrics
customer_summary = df.groupby('customer_id').agg({
    'order_date': 'count',         # Order frequency
    'order_amount': ['sum', 'mean'], # Total/avg revenue
    'customer_name': 'first'
}).reset_index()

customer_summary.columns = ['customer_id', 'order_count', 'total_revenue', 'avg_order', 'customer_name']

# 4. Identify dormant customers (no orders in 30 days)
all_customers = pd.read_sql("SELECT BPCNUM_0, BPCNAM_0 FROM BPCUSTOMER WHERE BPCSTA_0 = 'A'", engine)
dormant = all_customers[~all_customers['BPCNUM_0'].isin(df['customer_id'])]

# 5. Flag high-value customers with declining orders
customer_summary['status'] = customer_summary.apply(
    lambda row: 'High Value' if row['total_revenue'] > 5000 else
                'At Risk' if row['order_count'] == 1 else
                'Active',
    axis=1
)

# 6. Export to Excel with formatting
output_file = f"Customer_Analysis_{datetime.now().strftime('%Y%m%d')}.xlsx"
with pd.ExcelWriter(output_file, engine='openpyxl') as writer:
    customer_summary.to_excel(writer, sheet_name='Active Customers', index=False)
    dormant.to_excel(writer, sheet_name='Dormant Customers', index=False)

    # Apply conditional formatting to the 'status' column (6th column)
    worksheet = writer.sheets['Active Customers']

    for row in worksheet.iter_rows(min_row=2, max_row=len(customer_summary)+1, min_col=6, max_col=6):
        cell = row[0]
        if cell.value == 'High Value':
            cell.fill = PatternFill(start_color='00FF00', fill_type='solid')
        elif cell.value == 'At Risk':
            cell.fill = PatternFill(start_color='FF0000', fill_type='solid')

# 7. Email to sales team
def send_email(subject, body, attachment_path):
    msg = MIMEMultipart()
    msg['From'] = 'automation@colorfoods.fr'
    msg['To'] = 'sales-team@colorfoods.fr'
    msg['Subject'] = subject

    msg.attach(MIMEText(body, 'plain'))

    with open(attachment_path, 'rb') as attachment:
        part = MIMEBase('application', 'octet-stream')
        part.set_payload(attachment.read())
        encoders.encode_base64(part)
        part.add_header('Content-Disposition', f'attachment; filename={os.path.basename(attachment_path)}')
        msg.attach(part)

    server = smtplib.SMTP('smtp.office365.com', 587)
    server.starttls()
    server.login('automation@colorfoods.fr', os.getenv('EMAIL_PASSWORD'))
    server.send_message(msg)
    server.quit()

send_email(
    subject='Daily Customer Analysis',
    body=f"""Bonjour,

Here's today's customer analysis:
- {len(customer_summary)} active customers (last 30 days)
- {len(dormant)} dormant customers to follow up
- {len(customer_summary[customer_summary['status'] == 'At Risk'])} at-risk customers

Report attached.
""",
    attachment_path=output_file
)

print(f"βœ… Report generated and emailed: {output_file}")

Impact:

  • Time saved: 50 min β†’ 2 min (automation runs in background)
  • Accuracy: 100% (no manual copy-paste errors)
  • Consistency: Same format every day

Script 2: Weekly Sales Report Generator

Before: Sales manager spent 5 hours every Friday compiling weekly reports.

After:

import pandas as pd
import matplotlib.pyplot as plt
from datetime import datetime

# Reuses the SQLAlchemy 'engine' created in Script 1

# 1. Extract sales data for last 2 weeks
query = """
SELECT
    DATEPART(week, INVDAT_0) AS week_number,
    ITMREF_0 AS product_code,
    ITMDES_0 AS product_name,
    TSICOD_0 AS category,
    SUM(QTY_0) AS quantity_sold,
    SUM(INVNOT_0) AS revenue
FROM SINVOICE
WHERE INVDAT_0 >= DATEADD(week, -2, GETDATE())
GROUP BY DATEPART(week, INVDAT_0), ITMREF_0, ITMDES_0, TSICOD_0
ORDER BY week_number DESC, revenue DESC
"""

df = pd.read_sql(query, engine)

# 2. Calculate week-over-week growth
current_week = df[df['week_number'] == df['week_number'].max()]
previous_week = df[df['week_number'] == df['week_number'].max() - 1]

comparison = current_week.merge(
    previous_week,
    on='product_code',
    suffixes=('_current', '_previous')
)

comparison['revenue_growth'] = (
    (comparison['revenue_current'] - comparison['revenue_previous']) /
    comparison['revenue_previous'] * 100
).round(2)

# 3. Identify top movers
top_gainers = comparison.nlargest(10, 'revenue_growth')[['product_name_current', 'revenue_growth']]
top_losers = comparison.nsmallest(10, 'revenue_growth')[['product_name_current', 'revenue_growth']]

# 4. Generate visualizations
fig, axes = plt.subplots(2, 2, figsize=(14, 10))

# Revenue by category (current_week comes from the unmerged df, so columns are unsuffixed)
category_revenue = current_week.groupby('category')['revenue'].sum().sort_values()
axes[0, 0].barh(category_revenue.index, category_revenue.values)
axes[0, 0].set_title('Revenue by Category - Current Week')
axes[0, 0].set_xlabel('Revenue (€)')

# Top 10 products
top_products = current_week.nlargest(10, 'revenue')
axes[0, 1].bar(top_products['product_name'], top_products['revenue'])
axes[0, 1].set_title('Top 10 Products - Revenue')
axes[0, 1].tick_params(axis='x', rotation=45)

# Week-over-week trend
axes[1, 0].plot(comparison['product_name_current'][:20], comparison['revenue_growth'][:20], marker='o')
axes[1, 0].axhline(y=0, color='r', linestyle='--')
axes[1, 0].set_title('Top 20 Products - WoW Growth %')
axes[1, 0].tick_params(axis='x', rotation=90)

# Summary metrics
summary_text = f"""
Total Revenue:
  Current Week: €{current_week['revenue_current'].sum():,.2f}
  Previous Week: €{previous_week['revenue_previous'].sum():,.2f}
  Growth: {((current_week['revenue_current'].sum() / previous_week['revenue_previous'].sum() - 1) * 100):.2f}%

Top Gainer: {top_gainers.iloc[0]['product_name_current']} (+{top_gainers.iloc[0]['revenue_growth']:.1f}%)
Top Loser: {top_losers.iloc[0]['product_name_current']} ({top_losers.iloc[0]['revenue_growth']:.1f}%)
"""
axes[1, 1].text(0.1, 0.5, summary_text, fontsize=12, verticalalignment='center')
axes[1, 1].axis('off')

plt.tight_layout()
plt.savefig(f"Weekly_Sales_Report_{datetime.now().strftime('%Y%m%d')}.png", dpi=300)

print("βœ… Weekly report generated")

Impact:

  • Time saved: 5 hours β†’ 15 minutes (weekly)
  • Insight depth: Added trend visualizations that weren't feasible manually
  • Stakeholder satisfaction: Management now requests additional metrics

Script 3: Inventory Reconciliation Bot

Before: Warehouse manager manually cross-checked ERP vs. physical counts in a spreadsheet (2 hours/week).

After:

import pandas as pd
from datetime import datetime

# Reuses 'engine' and send_email() from Script 1

# 1. Load data sources
erp_stock = pd.read_sql("SELECT ITMREF_0, LOC_0, QTYSTU_0 FROM STOCK", engine)
warehouse_counts = pd.read_excel('warehouse_physical_count.xlsx')  # Updated daily by warehouse team

# 2. Merge and identify discrepancies
merged = erp_stock.merge(
    warehouse_counts,
    left_on=['ITMREF_0', 'LOC_0'],
    right_on=['product_code', 'location'],
    how='outer',
    indicator=True
)

discrepancies = merged[
    (merged['QTYSTU_0'] != merged['physical_count']) |
    (merged['_merge'] != 'both')
].copy()  # copy() so the column assignments below don't trigger SettingWithCopyWarning

# 3. Categorize issues
discrepancies['issue_type'] = discrepancies.apply(
    lambda row: 'Missing in ERP' if row['_merge'] == 'right_only' else
                'Missing in Warehouse' if row['_merge'] == 'left_only' else
                'Quantity Mismatch',
    axis=1
)

discrepancies['variance'] = (discrepancies['physical_count'] - discrepancies['QTYSTU_0']).fillna(0)

# 4. Generate action report
critical = discrepancies[abs(discrepancies['variance']) > 100]  # Variances above 100 units

critical.to_excel(
    f"Inventory_Discrepancies_{datetime.now().strftime('%Y%m%d')}.xlsx",
    index=False
)

# 5. Alert for critical items
if len(critical) > 0:
    send_email(
        subject=f'⚠️ {len(critical)} Critical Inventory Discrepancies Detected',
        body=f"""
Critical stock discrepancies detected:

{critical[['product_code', 'location', 'QTYSTU_0', 'physical_count', 'variance']].to_string()}

Please investigate immediately.
        """,
        attachment_path=f"Inventory_Discrepancies_{datetime.now().strftime('%Y%m%d')}.xlsx"
    )

Impact:

  • Time saved: 2 hours/week β†’ 10 minutes/week
  • Error detection: Found €15,000 in unaccounted inventory in first month
  • Proactive alerts: Issues caught same-day instead of monthly

Deployment & Scheduling

Task Scheduler (Windows)

# daily_automation.bat
@echo off
cd C:\ColorFoods\automation
python customer_analyzer.py >> logs\daily.log 2>&1

Windows Task Scheduler Configuration:

  • Trigger: Daily at 7:00 AM (before sales team arrives)
  • Action: Run daily_automation.bat
  • Conditions: Only run if network is available

Error Handling & Logging

import logging
from datetime import datetime

logging.basicConfig(
    filename=f'logs/automation_{datetime.now().strftime("%Y%m%d")}.log',
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s'
)

try:
    df = pd.read_sql(query, engine)
    logging.info(f"Successfully extracted {len(df)} records")
except Exception as e:
    logging.error(f"Database connection failed: {str(e)}")
    send_alert_email("🚨 Automation Error", str(e))
    raise

Measuring ROI

Time Savings Calculation

Task                 Manual (hours/week)   Automated (hours/week)   Savings
Customer Analysis    5 × 2 users = 10      0.5                      9.5h
Weekly Report        5                     0.25                     4.75h
Inventory Recon      2                     0.17                     1.83h
Total                17h/week              0.92h/week               16.08h/week

Annual Savings:

  • Time: 16.08 hours/week × 52 weeks = 836 hours/year
  • At €40/hour: €33,440/year

Efficiency Improvement Calculation

Baseline productivity (April 2021):

  • Sales team: 40 hours/week billable
  • Data tasks: 15 hours/week (37.5% of time)
  • Actual selling time: 25 hours/week

Post-automation (October 2021):

  • Data tasks: 0.92 hours/week (2.3% of time)
  • Actual selling time: 39.08 hours/week

Efficiency gain: (39.08 - 25) / 25 = 56% increase in selling time

However, I conservatively reported 35% because:

  • Some saved time went to training on new tools
  • Initial automation setup period (first month)
  • Not all data tasks were eliminated (strategic analysis still manual)

Additional Benefits (Unquantified)

  1. Error reduction: Zero data entry mistakes after automation
  2. Faster decision-making: Reports available 8 hours earlier each day
  3. Audit trail: All queries logged for compliance
  4. Scalability: Same scripts handle 2x customer volume without additional effort

Lessons Learned

What Worked

  1. Shadowing users first: Spent 1 week observing manual process before coding
  2. Incremental rollout: Deployed one script at a time for user testing
  3. Excel output format: Users preferred familiar Excel over "fancy dashboards"
  4. Over-communication: Daily Slack updates during testing phase built trust

Challenges

  1. ERP schema documentation: Sage X3 table names are cryptic (BPCORD_0?). Took 2 weeks to reverse-engineer.
  2. Database permissions: IT initially blocked SQL access. Required formal security review.
  3. Change management: Sales team initially skeptical ("Will this replace us?"). Required 1:1 training sessions.

Mistakes

  1. No version control initially: Lost 3 days of work to accidental overwrite
    • Fix: Implemented Git after week 2
  2. Hardcoded credentials: Security audit flagged this
    • Fix: Migrated to .env file with encrypted passwords
  3. No backup strategy: Script failure on Friday deleted report
    • Fix: Added file backups to network drive

Replicability: How to Implement at Your Company

Week 1: Discovery

  • Interview 3-5 users doing repetitive data tasks
  • Time each step in their workflow
  • Identify highest-impact process (saves most time)

Week 2: Proof of Concept

  • Build MVP script for one workflow
  • Test with 1 user
  • Gather feedback

Week 3-4: Refinement

  • Add error handling
  • Create user documentation
  • Set up logging

Week 5: Deployment

  • Schedule tasks
  • Train users
  • Monitor for issues

Week 6+: Expansion

  • Automate next-highest-impact process
  • Measure ROI
  • Report wins to management

Conclusion

Automating repetitive data tasks isn't just about saving time; it's about freeing skilled workers to do what they do best. At Color Foods, sales professionals went from data janitors to strategic sellers, and the company saw a measurable 35% efficiency gain.

Three Key Takeaways:

  1. Start small: Automate one painful process, prove ROI, then scale
  2. Prioritize user experience: Fancy tech doesn't matter if users won't adopt it
  3. Measure everything: Time savings, error rates, user satisfaction; data justifies investment

The best automation is invisible: Users should feel like their job got easier, not that they're using "automation software."

Code Repository

GitHub: github.com/nicolasavril/erp-automation

Includes:

  • Anonymized SQL queries
  • Scheduling templates
  • ROI calculator spreadsheet

About the Author

Nicolas Avril is a Data Scientist & AI Engineer with experience in process automation, business intelligence, and ERP integration. During his internship at Color Foods, he developed Python automation scripts that improved sales efficiency by 35%.

Connect: LinkedIn | Portfolio | GitHub

Have you automated data processes at your company? What challenges did you face? Share your experience in the comments!

If this article helped you, follow me for more practical guides on Data Science, automation, and business process optimization.

Need data process automation?

I can help automate your repetitive data tasks and improve team efficiency.