Image: The Indiana Jones Warehouse
The more stuff you own, the more time you spend managing it.
My mother was talking about clothing and home furnishings when she shared this insight, but it applies just as well to the world of data storage and analysis.
Whenever we start a data project with a client, we are always shocked by how much data they store that is simply not used. And this is not even data that they generate themselves. One of our clients had an SQL database with 1,800 economic indicators stored for 210 countries going back several decades. After talking with analysts directly, we discovered there were three HUGE problems with this approach:
The first problem was that most of the data pulls included only two core indicators: GDP and population. The second was that most of the data pulls were for six unique countries (USA, Japan and the BRICs). And the final problem was that various parts of the data were updated monthly, which meant that at least one member of the analyst team was responsible for refreshing this data twelve times a year, taking him two to three full days to complete each update.
So, let's review:
The database in question held 378,000 unique data series (1,800 indicators x 210 countries).
The most commonly used ones? Only twelve (two indicators x six countries).
The company was spending 200-300 man-hours a year to update one data set, of which 99.9968% was almost never used. And that cost of updating doesn't even include direct maintenance, storage requirements or troubleshooting.
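For readers who want to check the numbers, here is a minimal sketch of the arithmetic in Python. The eight-hour working day is our assumption (the client figures only say "two to three full days"), and the variable names are purely illustrative.

```python
# Figures taken from the client example above.
indicators = 1_800
countries = 210
total_series = indicators * countries            # every indicator for every country

core_indicators = 2                              # GDP and population
core_countries = 6                               # USA, Japan and the four BRICs
used_series = core_indicators * core_countries   # the series analysts actually pull

# Share of stored series that is almost never used.
unused_share = 1 - used_series / total_series

# Yearly refresh effort. ASSUMPTION: one "full day" is 8 working hours.
updates_per_year = 12
hours_per_day = 8
low_hours = updates_per_year * 2 * hours_per_day
high_hours = updates_per_year * 3 * hours_per_day

print(f"Total series:  {total_series:,}")
print(f"Used series:   {used_series}")
print(f"Unused share:  {unused_share:.4%}")
print(f"Update effort: {low_hours}-{high_hours} hours per year")
```

Under that eight-hour assumption the yearly effort lands just under the rounded 200-300 hour range quoted above.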
But why?
When asked why they were holding on to this much unused data, the client gave the response that all hoarders give:
"Because we might need it."
Cure inventory waste
Here are four ways to prevent huge amounts of inventory waste in analytical databases.
1. Flexibility is everything. Flexibility allows you to store only the data you need, adding and subtracting data when your needs change. There are two ways to practice flexibility. The first is the manner in which the data is structured and stored. SQL databases can be too rigid for some applications. While they are good for some data, make sure you look at NoSQL options like MongoDB or CouchDB, which can be better for storing price strips or economic indicators. Second, inserting and retrieving data must be easy. Simply put, if it is hard to put new data in or pull data out, then adjusting to new data requirements becomes nearly impossible.
2. Think about the use cases first. This applies to reducing many different kinds of data waste: database architects and analysts must agree on needs and requirements before starting a project. Conversations between the two must be forward looking, thinking thoroughly about how the database will adapt to rapidly changing requirements.
3. Keep the data updated. This seems obvious, but data that is out of date helps no one. Keep it fresh through manual updating or an automated ETL process, and it will be used regularly. Otherwise, analysts will find different solutions (like going directly to the source) which will drive down the value of your database enormously.
4. Use analytics to track usage. IT must understand what data their analysts are using, how frequently and for what purpose. By using services like Splunk, Kibana or the analytics in SQL Server, you can detect which data is being used and by whom, keeping your database lean and useful.
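To make the last tip concrete, here is a minimal sketch of usage tracking in plain Python. It is not how Splunk, Kibana or SQL Server do it. It assumes a hypothetical access log in which each record names the user, indicator and country that were pulled. The function name `usage_report` and the log layout are our own inventions for illustration.

```python
from collections import Counter
from itertools import product


def usage_report(access_log, indicators, countries):
    """Count pulls per (indicator, country) series and list series never pulled.

    `access_log` is a HYPOTHETICAL format: an iterable of dicts with
    "user", "indicator" and "country" keys.
    """
    counts = Counter((rec["indicator"], rec["country"]) for rec in access_log)
    all_series = set(product(indicators, countries))
    unused = sorted(all_series - set(counts))
    return counts, unused


# Tiny illustrative log, not real client data.
sample_log = [
    {"user": "analyst_a", "indicator": "GDP", "country": "USA"},
    {"user": "analyst_b", "indicator": "GDP", "country": "USA"},
    {"user": "analyst_a", "indicator": "population", "country": "Japan"},
]

counts, unused = usage_report(
    sample_log,
    indicators=["GDP", "population", "CPI"],
    countries=["USA", "Japan"],
)

for series, n in counts.most_common():
    print(series, n)
print("Never pulled:", unused)
```

Series that stay on the "never pulled" list for long are the ones to consider dropping or refreshing less often.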
Readers: What did we miss? How do you and your team make sure that the data you maintain is being used effectively?