Batch Sizes
2008-04-11 02:16:08-0400 - processes (1),production (1),coding (1) - 20 comments
My girlfriend is into Operations Research. Like any boyfriend, I can’t help but get somewhat interested in what she’s doing. Over the summer, she let me read The Goal. Other than the fact that it got very repetitive about halfway through, it was a good read. But then, they were trying to teach us something. (Repetitio est mater studiorum!)
In O.R. (Operations Research), one thing you learn a lot about is the factory. Factories are complicated entities, what with all the complicated dependency chains, asynchronous processing, multiple failure points, and deadlines, just to mention a few of the problems. I learned about this buzzword-process called the Theory of Constraints about six months ago, and have been pondering the importance of factory analysis ever since.
Fast forward to today. At the company I work for, we have a problem with processing of logs. It’s complicated. There are at least 4 different sets of rules with their own systems. Every night, every single one of our >50 million log entries needs to be scanned and analyzed. This isn’t the difficult part, though. We recently moved to a second (and third) data center. So of course, our problems became exponentially worse. Now I had to start moving computation around (boy do I wish I had the infrastructure for Hadoop/MapReduce). Things started failing more and more often. (Hard disk errors, network failures, etc.) How do we make this system more robust?
I made a realization: We’re undergoing the same set of problems that factories undergo. A small failure in one spot propagates and ruins the rest of the system. Our initial approach (try to make everything “work” all the time; I know, naïve) is just starting to crumble. After trying to make errors more visible and to make ourselves respond earlier, I finally realized what the problem was:
Our batch sizes are too large.
Maybe this is common sense, but I don’t think it is for most people. After all, this is a very fundamental realization in Operations Management (for stick figures, read the summary section of this page). Currently, we do all of the processing nightly. This is nice for a couple of reasons:
- Simple (thus less prone to breakage)
- Processing runs during hours when the machines aren’t under load [1].
However, there’s one thing this is not good for: recoverability [2]. As any system grows, as more and more dependencies get added on, the mean time between failures will shrink. This is a given. How we handle those failures is what matters. Do we let a failure propagate to the end user? Do we force it to repeatedly wake up our sysadmin at 2am every night? I believe we can answer both of these with a resounding NO. We should build the system to expect failure. To do this, though, we have to reduce the batch sizes. So my new system will contain two major changes:
I. Cut batch sizes from day-long to hour-long
I think this is obvious. If we need to recover, we need to have time to respond. Lowering batch sizes will help for at least three reasons:
- The time between when the processing starts and when the end user sees the information is increased to 12 hours on average, which gives us time to respond to a failure.
- Any failure results in a worst-case run of 25 minutes worth of log processing, instead of 5 hours.
- During the 8-hour business day, there are 8 chances to get notified of code flakiness.
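To put rough numbers on the first and third points, here is a small back-of-the-envelope sketch. It rests on assumptions that are mine and not measurements from our system: the end user sees a report once per day, hours are counted from the start of that daily cycle, hourly batches start on the hour, and the single nightly run is a 5-hour job timed to finish right when the report is due. The function names are made up for illustration.

```python
def recovery_slack_hours(batch_starts, report_hour=24):
    """Hours between each batch start and the (assumed) daily report time."""
    return [report_hour - start for start in batch_starts]


def mean(values):
    """Plain arithmetic mean."""
    return sum(values) / len(values)


# Assumption: hourly batches start at hours 0, 1, ..., 23 of the daily cycle.
hourly_starts = list(range(24))

# Assumption: one nightly run, started 5 hours before the report is due.
nightly_starts = [19]

hourly_avg = mean(recovery_slack_hours(hourly_starts))
nightly_avg = mean(recovery_slack_hours(nightly_starts))

# With hour-long batches, an 8-hour business day contains 8 batch runs.
business_day_hours = 8
batch_hours = 1
runs_per_business_day = business_day_hours // batch_hours

print(f"hourly batches:  {hourly_avg} h average slack before the report")
print(f"nightly batch:   {nightly_avg} h slack before the report")
print(f"runs during the business day: {runs_per_business_day}")
```

Under these assumptions the hourly schedule leaves about 12 hours of slack on average, which is where the figure in the first point plausibly comes from; a different report schedule would shift the numbers.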
II. Allow the system to be resilient to failure
This is a little tricky. The basic idea behind this is that whenever pieces of the system fail, we:
- Notify a human of the error.
- Move on (eventual consistency)
- Make it easy (or automatic) to fill in the gaps in subsequent processes
- When reports are end-user facing, verify consistency
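The four steps above could look roughly like the sketch below. This is not our production code, just a minimal illustration: `process` and `notify` are hypothetical callables you would supply, `done` is an in-memory set standing in for whatever store records finished batches, and the flaky processor in the demo is invented to show one failure being skipped and later backfilled.

```python
from datetime import datetime, timedelta


def hour_range(start, end):
    """Yield the start time of each hourly batch in [start, end)."""
    current = start
    while current < end:
        yield current
        current += timedelta(hours=1)


def run_batches(start, end, process, notify, done):
    """Process every hourly batch not yet in `done`, never stopping on failure.

    Returns the batches that failed during this pass (the gaps).
    """
    gaps = []
    for batch in hour_range(start, end):
        if batch in done:
            continue  # finished in an earlier pass, nothing to fill in
        try:
            process(batch)
        except Exception as exc:
            # Notify a human, record the gap, and move on to the next batch.
            notify(f"batch {batch:%Y-%m-%d %H:00} failed: {exc}")
            gaps.append(batch)
        else:
            done.add(batch)
    return gaps


def is_consistent(start, end, done):
    """True when every hourly batch in [start, end) has been processed."""
    return all(batch in done for batch in hour_range(start, end))


# Demo with a hypothetical processor that fails once for the 02:00 batch.
start = datetime(2008, 4, 11, 0)
end = start + timedelta(hours=4)
done, alerts, failed_once = set(), [], set()


def flaky_process(batch):
    if batch.hour == 2 and batch not in failed_once:
        failed_once.add(batch)
        raise IOError("disk error")


gaps_first = run_batches(start, end, flaky_process, alerts.append, done)
consistent_first = is_consistent(start, end, done)   # report must wait

gaps_second = run_batches(start, end, flaky_process, alerts.append, done)
consistent_second = is_consistent(start, end, done)  # gap has been filled
```

Because a later pass only touches batches missing from `done`, filling in the gaps is just a matter of running the same function again, and the consistency check is the gate before anything end-user facing goes out.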
I think the idea is obvious, but the prospect of adding complexity is scary (rightfully so). In this particular case, I think the benefits are too great to pass up. Also, from looking around, I think most other companies process their data hourly, but that’s based on a very quick survey.
[1] We don’t really take advantage of this, for fear of complexity.
[2] I know I made that word up. Is nounification really a crime?





