Zebra Coffee Inventors Hair Windows Mobile Armed Security Guard Terrorism Advertising Signs Atomic Clock Massage Chair Best Buys Navy Seals Osteoporosis Hunter Ceiling Fans Cheap Lcd Tv Forest Park Gold Wedding Rings Atomic Clocks Cuisinart Ice Cream Makers Extra Money Freedom Used Audi Tt Arthritis Relief Sliding Doors Find Personal Trainer Cheap Coffee Grinder All Inclusive Vacation Packages Gap Kids Cpr Training Easy Money Local Dentist Mlt Vacations Carbon Monoxide Alarm Make Money Typing Anxiety Avg Diamond Wedding Rings Cpr Wholesale Vitamins Treadmill Review Granite Contractor Bedroom Furniture Store Buy Dvd Movies Unique Shower Invitations Unique Gift Telephone Baby Modeling Friend Best Skin Care Def Leppard Tickets Suv Tires True Com Las Vegas Show Free Blog Ford Tractor Parts Custom Baby Shower Invitations North Myrtle Beach Rentals Scion Cheap Condom Corner Tv Stands Kids Play Furniture Vacation Aa Batteries Mattress Personalized Baby Shower Invitations Home Search Medicine Maui Numbered Poker Chips Doors Hotel Reservations Top 10 Home Based Business Car Auctions Online Online Business Opportunity Karaoke Cds Diamonds Online Car Auctions Desktop Pc Lcd Plasma Tv Do It Yourself Credit Repair Birth Control Pill Contemporary Home Furniture Pc Carnival Cruise Discount Used Laptop Granite Kitchen Countertop Wood Jewelry Boxes Minimum Wage Video Cameras Windows Media Center Postpartum Depression Dentist Office Jack Rabbit Vibrators Office Furniture Warehouse Couple Baby Shower Invitations Encyclopedia Spanish Translator Electric Treadmill Contemporary Home Office Furniture How To Lose Weight Fast All Inclusive Family Vacations Life Is Good T Shirt All Inclusive Vacation Refurbished Laptop Home Computer Desks Rabbit Vibrators Wireless Internet Connection Jackson Ideal Diamonds Lose Weight Quick Buy Dvd Online Calculator Web Messenger 2007 Mustang Fruit Gift Baskets Carbon Monoxide Detector Cheap Birth Announcements Modern Lamps Free Ringtones Government Cash Grants Sex Toys Education Grants Cheapest Magazine Subscription Ideal Cut Diamonds Bar Stools Uk How To Make Extra Money Colleges And Universities Nifty Personal Trainer Church Fundraising Trampoline Nets Mls Real Estate Listing Portable Gps Free Playstation 3 Cheap Area Rugs Commercial Christmas Lights Seattle Dating Ovulation Kits Yankee Candle Panama City Beach Movies On Dvd Guitar Lessons Gps Systems Albertsons Supermarket Dodge Truck Accessories Make Money Now Repo Auto Auctions Fly Fishing Gear Panama City Beach Fl Last Minute Airline Tickets Mercedes Wagons Granite Tile Flooring Freelance Graphic Design Silk Bedding Personal Christmas Gift Baskets Ram Earn Easy Money Commercial Real Estate Auto Body Repair Mini Cooper Convertible Repo Auto Auction Adelphia Spyware Hyundai Deals Cheap Toys Bahamas Travel Buy Computer Desk Treadmill Acura Tl King Mattress Patio Furniture Baby Baptism Gifts Fragrance Perfume Gift Modeling Classes Home Entertainment Centers Plastic Buy Computer Desks Trip To Vegas Atlantic City Hotel Discounts Virginia Beach Tractor Parts Psoriasis Personal Loans For People With Bad Credit Online Background Check Wicker Furniture Antivirus Work From Home Business Clitoral Vibrators Emphysema Paper Airplanes Discount Photoshop Rabbit Pearl Vibrator Backpacking Sleeping Bags Home Business Duke Energy Engraved Personalized Gifts Women's Sunglasses English To Spanish Translator Kitchen Remodel Design Cheap Airfares Costa Rica Travel Makeup Designer Sunglasses Mustang Dealership Air Fare Flooring Free Email Account Dodge Neon Rabbit Sex Toy Old Dominion Home Garage Doors Baby Gap Travel To Las Vegas Hacking Anxiety Problems Pre Menopause Sliding French Doors People.com Graphic Design Firm Cheap Vacation Deals 12ft Trampoline Weber Gas Grill Abs Exercise Equipment Logo Designers Sauna Accessories Discount New Cars Magic Bullet Juicer Certified Diamonds Cheap Fireplace Cheap Christmas Cards Cheap Checks Carhartt Jackets Divorce Prescription Sunglasses Wrought Iron Beds Homebased Business Opportunity Big Truck Trader Unique Business Opportunity Engraved Signs Birth Control Pills Internet Money Repossessed Car Auctions Catherine Zeta-jones Swiss Army Wedding Dj Chrysler Jeep Men's Watches Best Dating Sites Auto Transmission Repair Christian Dating Service Unique Birthday Gift North Myrtle Beach Men's Shorts Ephedra Making Money Specialty Gift Baskets Work At Home Jobs World Of Warcraft Leveling Service No Credit Personal Loans Safety Glasses Bulgaria Microsoft Office Suite Gift Baskets Delivered New York Hotels Make Money Online Business Mattress Sales Resin Wicker Furniture Repo Car Auctions Beauty And The Beast New York Palm Pda Estate Auctions Chevy Chase Web Proxy Wireless Cell Phone Buy Vibrators Work At Home Data Entry Art Supplies Canvas Waterproof Vibrators Top Home Based Business Notebook Memory Zip Code Easy Money Online Panama Canal How To Make Money From Home Plasma Entertainment Center Panama Mardi Gras Masks Leah Remini Cheap Diamond Rings Download Porn Movies Air Tran Atlantis Laminate Flooring Mary Kay Representative Executive Gift Baskets Las Vegas Getaways Unfinished Furniture Custom Promotional Products Real Dex Cheap Gift Baskets Whirlpool Appliance Local Day Spa Nashville Tennessee Fastest Way To Make Money Life Fitness Treadmill First Birthday Invitations Fancy Dog Beds Next Uk Help Justin Timberlake Tickets Mary Kay Beauty Consultant Menopause Help Auto Repair Contemporary Engagement Rings Anastasia Trampoline Enclosure Airtran Airways Alzheimers Disease New York Dating Online Home Based Business Cell Phone Accessories Discount Mattress Cheap Pet Supplies Drunk Driving Accidents Identity Theft Restoration Mitsubishi Austin Atlanta Modeling Jobs Men's Dress Shirts Grants Money Three Stone Diamond Ring Tax Preparation Service Custom Invitations Firewall Software Chronic Back Pain Modern Lamps Diaper Bags Magic Bullet Processor Parkinson's Disease House Signs Advertising Agencies Dentist Office Wedding Gift Baskets Home Office Computer Desk Dump Truck Best Colleges Disney World Florida Desert Miami Grand Canyon Train Concert Tickets Shower Doors Vehicle Signs Best Skin Care Braces For Teeth Alaska Birthday Gift Baskets How To Make Money Online Notebooks Affordable Website Design Buy Hoodia To Make Easy Money Find Cheap Airline Tickets Coffee Makers Personalized Banners Boating Womens Perfume Coffee Maker Bahamas Trip Work From Home Business Legitimate Work From Home Design Tablet Pc Home Solar Panels Security Guard Company Desk Clocks Life Is Good Clothing World Of Warcraft Power Leveling Cheap Logo Chrysler Convertible Big Truck Trader Belgium Pilates Modular Homes Builder Revolution Pet 2007 Chrysler Business Grants Skin Care Products For Men Nra Men's Boxers Cheap Magazines Fireplace Doors Frameless Glass Shower Doors Make Extra Money Double Savannah Georgia Phone Aftermarket Tractor Parts Leah Remini Signs And Banners Zero Gravity Commercial Ice Maker King Size Mattress Snack Gift Baskets Data Entry Work From Home Tax Id New Homes For Sale Cheap Car Parts Independent Auto Dealer Avenue Q Cloth Banners Ford Trucks Display Signs Easy Ways To Earn Money Crafts Supplies Cheap Granite Countertops Us Flags Best Home Business Opportunity Breast Reduction Cookie Gift Baskets Compact Tractors Mass Trampoline Sales Cheap Engagement Rings Mercedes Convertibles Travel To Las Vegas Dating Cheap Prepaid Cell Phone Seagrass Rugs Free Cell Phone Plan The Mls Sauna Accessories Tax Preparation Wood Computer Desk Panama City Fla Donald Trump Princess Cut Engagement Rings Cheap Window Glass Copy Paper Car Auto Repair Earn Easy Money Work From Home Jobs Glass French Doors Www Applebees Com Make Money On Line Vitamins And Supplements Stock Kitchen Cabinets Queen Size Mattress Kids Computer Desk Denver Personal Trainer Movado Watches Diamond Solitaire Engagement Rings Photo Birthday Invitations Oak Wine Rack Buy Modular Homes Work At Home Data Entry Bahamas All Inclusive Blue Cross Blue Shield Of Alabama Portable Steam Sauna Photography British Newsgroups Vacation Rentals Medicine Advertising Agency Careers Hd Tv Ergonomic Computer Desk Loose Diamonds Exercise Equipment Sales Anti Spyware Software Overcoming Anxiety Bass Work At Home Moms Gold Bullion Coins Cheap Lcd Tv Discount Vacations Free Ring Tones Back Massage Lodging Cheap Costume Jewelry Amber Earrings Walt Disney World Florida Information Technology Gourmet Coffee Store Live Maine Lobster Small Wrought Iron Beds Gmc Canada Staples Woodstock Black Dating Service Cheap Flights To Paris Art Supplies Online Bankruptcy Court Estate Auctions

nedjelja, 24. veljače 2008.

When not to normalize your database

Database normalization is a formal process of designing your database to eliminate redundant data, utilize space efficiently and reduce update errors. Anyone who has ever taken a database class has it drummed into their heads that a normalized database is the only way to go. This is true for the most part . However there are certain scenarios where the benefits of database normalization are outweighed by its costs. Two of these scenarios are described below.

Immutable Data and Append-Only Scenarios

Pat Helland, an enterprise architect at Microsoft who just rejoined the company after a two year stint at Amazon, has a blog post entitled Normalization is for Sissies where he presents his slides from an internal Microsoft gathering on database topics. In his presentation, Pat argues that database normalization is unnecessary in situations where we are storing immutable data such as financial transactions or a particular day's price list.

When Multiple Joins are Needed to Produce a Commonly Accessed View

The biggest problem with normalization is that you end up with multiple tables representing what is conceptually a single item. For example, consider this normalized set of tables which represent a user profile on a typical social networking site.

user table
user_id first_name last_name sex hometown relationship_status interested_in religious_views political_views
12345 John Doe Male Atlanta, GA married women (null) (null)
user_affiliations table
user_id (foreign_key) affiliation_id (foreign key)
12345 42
12345 598
affiliations table
affiliation_id description member_count
42 Microsoft 18,656
598 Georgia Tech 23,488
user_phone_numbers table
user_id (foreign_key) phone_number phone_type
12345 425-555-1203 Home
12345 425-555-6161 Work
12345 206-555-0932 Cell
user_screen_names table
user_id (foreign_key) screen_name im_service
12345 geeknproud@example.com AIM
12345 voip4life@example.org Skype
user_work_history table
user_id (foreign_key) company_affiliation_id (foreign key) company_name job_title
12345 42 Microsoft Program Manager
12345 78 i2 Technologies Quality Assurance Engineer

This is the kind of information you see on the average profile on Facebook. With the above design, it takes six SQL Join operations to access and display the information about a single user. This makes rendering the profile page a fairly database intensive operation which is compounded by the fact that profile pages are the most popular pages on social networking sites.

The simplest way to fix this problem is to denormalize the database. Instead of having tables for the user’s affiliations, phone numbers, IM addresses and so on, we can just place them in the user table as columns. The drawback with this approach is that there is now more wasted space (e.g. lots of college students people will have null for their work_phone) and perhaps some redundant information (e.g. if we copy over the description of each affiliation into an affiliation_name column for each user to prevent having to do a join with the affiliations table). However given the very low costs of storage versus the improved performance characteristics of querying a single table and not having to deal with SQL statements that operate across six tables for every operation. This is a small price to pay.

As Joe Gregorio mentions in his blog post about the emergence of megadata, a lot of the large Web companies such as Google, eBay and Amazon are heavily into denormalizing their databases as well as eschewing transactions when updating these databases to improve their scalability.

Maybe normalization is for sissies…

UPDATE: Someone pointed out in the comments that denormalizing the affiliations table into user's table would mean the member_count would have to updated in thousands of user's rows when a new member was added to the group. This is obviously not the intent of denormalization for performance reasons since it replaces a bad problem with a worse one. Since an affiliation is a distinct concept from a user, it makes sense for it to have it's own table. Replicating the names of the groups a user is affiliated with in the user table is a good performance optimization although it does mean that the name has to be fixed up in thousands of tables if it ever changes. Since this is likely to happen very rarely, this is probably acceptable especially if we schedule renames to be done by a cron job during offpeak ours On the other hand, replicating the member count is just asking for trouble.

UPDATE 2: Lots of great comments here and on reddit indicate that I should have put more context around this post. Database denormalization is the kind of performance optimization that should be carried out as a last resort after trying things like creating database indexes, using SQL views and implementing application specific in-memory caching. However if you hit massive scale and are dealing with millions of queries a day across hundreds of millions to billions of records or have decided to go with database partitioning/sharding then you will likely end up resorting to denormalization. A real-world example of this is the Flickr database back-end whose details are described in Tim O'Reilly's Database War Stories #3: Flickr which contains the following quotes

tags are an interesting one. lots of the 'web 2.0' feature set doesn't fit well with traditional normalised db schema design. denormalization (or heavy caching) is the only way to generate a tag cloud in milliseconds for hundereds of millions of tags. you can cache stuff that's slow to generate, but if it's so expensive to generate that you can't ever regenerate that view without pegging a whole database server then it's not going to work (or you need dedicated servers to generate those views - some of our data views are calculated offline by dedicated processing clusters which save the results into mysql).

federating data also means denormalization is necessary - if we cut up data by user, where do we store data which relates to two users (such as a comment by one user on another user's photo). if we want to fetch it in the context of both user's, then we need to store it in both shards, or scan every shard for one of the views (which doesn't scale). we store alot of data twice, but then theres the issue of it going out of sync. we can avoid this to some extent with two-step transactions (open transaction 1, write commands, open transaction 2, write commands, commit 1st transaction if all is well, commit 2nd transaction if 1st commited) but there still a chance for failure when a box goes down during the 1st commit.

we need new tools to check data consistency across multiple shards, move data around shards and so on - a lot of the flickr code infrastructure deals with ensuring data is consistent and well balanced and finding and repairing it when it's not."

The part highlighted in red is also important to consider. Denormalization means that you you are now likely to deal with data inconsistencies because you are storing redundant copies of data and may not be able to update all copies of a column value simultaneously when it is changed for a variety of reasons. Having tools in your infrastructure to support fixing up data of this sort then become very important.

Nema komentara: