Troubleshooting Database Woes: File Corrupted

Ever had a day where things just didn't click? Our data world had one of those today when our trusty petl tool hit a snag. petl is an internal tool that we use to execute and schedule the data transformations we create in Pentaho.

Imagine waking up to find out that our overnight data magic tool failed, all thanks to a pesky database error. In this post, we'll share the mystery, the logs, and how we cracked the case.

The Problem Unveiled: IllegalStateException: File corrupted in chunk 526526, expected page length 4..32, got 538976288 [1.4.199/6]

So, petl didn't dance to the data tune like it usually did. The logs were shouting in computer language we didn't fully get:

2023-12-04 08:07:26.164  WARN 958 --- [eduler_Worker-4] o.h.engine.jdbc.spi.SqlExceptionHelper   : SQL Error: 50000, SQLState: HY000
2023-12-04 08:07:26.164 ERROR 958 --- [eduler_Worker-4] o.h.engine.jdbc.spi.SqlExceptionHelper   : General error: "java.lang.IllegalStateException: File corrupted in chunk 526526, expected page length 4..32, got 538976288 [1.4.199/6]"; SQL statement:
select jobexecuti0_.uuid as uuid1_0_, jobexecuti0_.completed as complete2_0_, jobexecuti0_.config as config3_0_, jobexecuti0_.description as descript4_0_, jobexecuti0_.error_message as error_me5_0_, jobexecuti0_.initiated as initiate6_0_, jobexecuti0_.job_path as job_path7_0_, jobexecuti0_.parent_execution_uuid as parent_e8_0_, jobexecuti0_.sequence_num as sequence9_0_, jobexecuti0_.started as started10_0_, jobexecuti0_.status as status11_0_ from petl_job_execution jobexecuti0_ where jobexecuti0_.job_path=? order by jobexecuti0_.started desc [50000-199]
2023-12-04 08:07:26.166 ERROR 958 --- [eduler_Worker-4] org.pih.petl.api.ScheduledExecutionTask  : An error occured while executing a scheduled job.  Aborting all remaining jobs.

Translation: the H2 database file backing petl's job metadata had a damaged chunk, so every remaining scheduled job was aborted. (A telling detail: 538976288 is 0x20202020 in hex, four ASCII space characters, meaning H2 found a run of blanks where it expected a page header.)

The Investigation: One Hour of Head-Scratching

We tried playing detective. Checked recent changes, updates, and even our morning coffee intake – nada. The error had us stumped.
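One avenue worth knowing about if you hit the same wall: H2 ships a Recover tool that dumps whatever it can still read from a corrupted .mv.db file into a SQL script you can inspect or replay. Here's a minimal sketch in Java; the /opt/petl path is a placeholder for wherever your petl.mv.db actually lives:

import java.sql.SQLException;

import org.h2.tools.Recover;

public class DumpCorruptDb {
    public static void main(String[] args) throws SQLException {
        // Scan every page of petl.mv.db that is still readable and write
        // the salvaged schema and data to petl.h2.sql in the same directory.
        Recover.execute("/opt/petl", "petl");
    }
}

The same tool runs from the command line with java -cp h2-1.4.199.jar org.h2.tools.Recover -dir /opt/petl -db petl. That's the gentle option; what we actually did was blunter.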

The Lightbulb Moment: Deleting the Trouble-Makers

In a moment of "let's try this," we deleted the petl.mv.db and petl.trace.db files (the database sidekicks). petl.mv.db is the data file for petl's embedded H2 database and holds its job metadata, including the petl_job_execution table from the failing query; petl.trace.db is the trace log H2 writes for inspecting and debugging itself. Deleting petl.mv.db does throw away the stored job-execution history, so it's a last resort, but it was a trade we were willing to make. Restarted the system, crossed our fingers, and voila – petl woke up, ready for action!
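For the record, the manual fix boils down to something like this sketch (again, /opt/petl stands in for petl's home directory; stop petl first so nothing is holding the files open):

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ResetPetlDb {
    public static void main(String[] args) throws IOException {
        Path home = Paths.get("/opt/petl"); // placeholder path
        // Remove the corrupted database file and its trace log. An embedded
        // H2 database is created from scratch on the next connection, which
        // is why petl came back up cleanly after a restart.
        Files.deleteIfExists(home.resolve("petl.mv.db"));
        Files.deleteIfExists(home.resolve("petl.trace.db"));
    }
}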

So, if your data day feels a bit off, remember our story. Errors happen, but so do solutions. Stay curious, keep it simple, and happy troubleshooting!