Testing Data Imports: CSV, JSON, and Excel Files
How to thoroughly test data import functionality to catch errors before they affect users.
Why Import Testing Matters
Data import is often a critical feature for business applications. Users expect to upload spreadsheets, migrate data from other systems, and bulk update records. When imports fail or corrupt data, trust erodes quickly.
Thorough import testing catches issues like:
- Files that cause crashes or timeouts
- Data that gets corrupted during processing
- Edge cases that bypass validation
- Performance problems with large datasets
Preparing Test Data
Effective import testing requires diverse test files:
Standard Valid Data
- Normal records matching expected format
- Various record counts (10, 100, 1000, 10000)
- All supported data types
Edge Cases
- Empty files (just headers, no data)
- Single record files
- Maximum allowed records
- Files at or near size limits
Invalid Data
- Missing required fields
- Wrong data types (text in number fields)
- Values outside expected ranges
- Duplicate primary keys
- Malformed files
CSV Import Testing
CSV files appear simple but have many variations:
Format Variations
- Delimiters: Comma, semicolon, tab
- Line endings: Windows (CRLF), Unix (LF), old Mac (CR)
- Encoding: UTF-8, ISO-8859-1, Windows-1252
- Quoting: Quoted values, escaped quotes
Common CSV Issues to Test
- Commas within quoted fields
- Line breaks within quoted fields
- Double quotes within values
- Trailing commas
- Inconsistent column counts
- BOM (Byte Order Mark) characters
Example Problem Cases
# Comma in value
"John Smith","123 Main St, Apt 4","New York"
# Quote in value (escaped)
"Company ""ABC"" Corp","Description"
# Newline in value
"Line 1
Line 2","Regular field"JSON Import Testing
JSON imports can involve complex nested structures:
Structure Variations
- Array of objects (most common for bulk import)
- Single object with arrays
- Nested objects at multiple levels
- Mixed types in arrays
JSON-Specific Issues
- Invalid JSON syntax
- Missing closing brackets
- Trailing commas (valid in JavaScript, invalid in JSON)
- Comments (not allowed in JSON)
- Very deep nesting
- Circular references
Data Type Testing
- Null values vs. empty strings
- Integer vs. float numbers
- Very large numbers (precision limits)
- Boolean representations
- Date formats (JSON has no native date type)
Performance Testing
Import speed matters for user experience:
Scale Testing
- Small files (100 records) - baseline
- Medium files (10,000 records) - typical use
- Large files (100,000+ records) - stress test
Metrics to Track
- Time to complete import
- Memory usage during processing
- Database transaction size
- Progress feedback accuracy
Timeout Handling
- What happens if import takes too long?
- Can users cancel in progress?
- Is partial data committed or rolled back?
Error Handling Tests
Good error handling is as important as successful imports:
Error Scenarios
- Invalid file format entirely
- Valid format, some invalid rows
- File corrupted mid-import
- Network interruption during upload
- Insufficient permissions
Error Reporting Quality
- Are error messages user-friendly?
- Do they identify which rows failed?
- Is the specific issue explained?
- Can users download an error report?
Security Considerations
Import functionality can be a security risk:
- File type validation: Ensure only expected formats are accepted
- Size limits: Prevent denial-of-service via huge files
- Content validation: Check data, not just format
- Injection prevention: Sanitize imported data
Creating Test Files
For comprehensive testing, you need varied test files:
- Use data generators for realistic content
- Create files at different sizes
- Include international characters
- Generate both valid and invalid samples
- Document what each test file validates
Testing Checklist
- ☐ Valid file with standard data
- ☐ Empty file (headers only)
- ☐ Single record file
- ☐ Large file (at limit)
- ☐ File exceeding size limit
- ☐ Wrong file format
- ☐ Missing required columns
- ☐ Invalid data values
- ☐ Special characters
- ☐ Different encodings
- ☐ Duplicate records
- ☐ Concurrent imports
Conclusion
Import functionality touches database integrity, user experience, and security. Investing in thorough testing with diverse test files prevents data issues that are difficult to fix after the fact.
Build a library of test files covering normal cases, edge cases, and error conditions—then run them against every change to your import logic.