On Friday the 27th of June, we experienced about 90 minutes of unexpected downtime to fix an error in the way that transaction identifiers from our live feed data provider were being stored. After the fixes were applied, the live feed service was down for an additional 12 hours while transaction ID integrity was verified, and repaired where necessary.
An initial report of transactions missing for recent days arrived on the afternoon of the 26th of June, NZST. This was followed by more reports the next day, and we started investigating the underlying cause for transactions not appearing.
We found that our bank feed data provider, Yodlee, had recently hit the upper limit of a 32-bit integer, 2147483647, for their internal IDs. If a transaction ID from Yodlee was larger than this, the number would be “truncated” before being stored in the standard “integer” column that PocketSmith used. This truncation meant that every ID that should have been above 2147483647 was stored as 2147483647.
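To illustrate the behaviour described above, here is a minimal Python sketch. It assumes the column simply caps oversized values at the 32-bit maximum, as described; the function name `store_in_int32_column` and the sample IDs are made up for illustration and are not PocketSmith's actual code.

```python
INT32_MAX = 2**31 - 1  # 2147483647, the largest signed 32-bit integer


def store_in_int32_column(value: int) -> int:
    """Hypothetical stand-in for a 32-bit integer column that caps oversized values."""
    return min(value, INT32_MAX)


# Made-up IDs: one below the limit, two above it.
incoming_ids = [2147483000, 2147483648, 2147490000]
stored_ids = [store_in_int32_column(i) for i in incoming_ids]

print(stored_ids)  # [2147483000, 2147483647, 2147483647]
```

Two different incoming IDs end up stored as the same number, which is where the trouble starts.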
This is a fairly big deal in PocketSmith. The unique transaction ID from Yodlee is how we track which transactions have already been retrieved, allowing us to remove old pending transactions and avoid duplicates. Dates, amounts and payee names change too much to be relied upon; this unique transaction ID is key. Transactions started going missing because PocketSmith thought that they’d already been retrieved - they had an ID of 2147483647, after all.
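The effect on duplicate detection can be sketched in a few lines of Python. This is only an illustration under assumptions: the `sync` function, the tuple layout and the sample payees are hypothetical, and the real sync logic is more involved.

```python
INT32_MAX = 2**31 - 1  # 2147483647


def sync(incoming, seen_ids):
    """Return the transactions treated as new, skipping any whose stored ID is known.

    `incoming` is a list of (provider_id, payee) tuples and `seen_ids` is the set
    of IDs already stored. Both shapes are assumptions for illustration.
    """
    imported = []
    for provider_id, payee in incoming:
        stored_id = min(provider_id, INT32_MAX)  # capped when stored
        if stored_id in seen_ids:
            continue  # looks like a duplicate, so it is skipped
        seen_ids.add(stored_id)
        imported.append((stored_id, payee))
    return imported


seen = set()
feed = [(2147483648, "Cafe"), (2147483649, "Grocer"), (2147483650, "Power bill")]
imported = sync(feed, seen)

print(imported)  # [(2147483647, 'Cafe')]
```

Only the first transaction is imported; the other two are distinct at the provider's end, but look like duplicates once their IDs have been capped.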
The initial fix went off without a hitch over the space of 90 minutes on Friday afternoon. However, we soon realised that there were a large number of transactions within PocketSmith that featured the fateful number as their ID. If an affected user synced with the live bank feed service with these transaction IDs in place, they would likely lose categorisation for all affected transactions.
Following the initial core fixes, it was decided at 2.30pm Friday that we’d bring the main PocketSmith application back online, but leave the live bank feeds offline. We didn’t want anyone to lose the categorisation on their transactions, so work started on figuring out how to get the correct, more-than-2147483647 IDs in place.
The fixes took an additional 12 hours, due to the delicate nature of the exercise. Correct transaction IDs were restored without affecting categorisation or any other data, and live feeds were switched back on at 2.40am Saturday.
This issue will not occur again due to the updates to how we store transaction identifiers. Live feed downtime would also have been greatly reduced had some planned features been released (including http://psmth.to/S8), so we’re looking at shuffling some roadmap items as a result. We’ve also scheduled a review of how we manage duplicate Yodlee IDs, to see if we can put further checks in place.
Application downtime is something we avoid if possible; unfortunately the deep changes required on Friday made it inevitable in this case. We’re sorry for the inconvenience caused by the initial downtime, and also for live feeds being out-of-action for so long.
Thanks for your patience and support while we got all the above sorted out. Please let us know if you have any questions or notice anything out of the ordinary from here, and we’ll be happy to assist.