by Sam McFarland, on December 7, 2016

Estimated reading time: 6 minutes and 18 seconds
As a data analyst, I’ve been fortunate enough to have been a part of numerous big data initiatives with several organizations. Some of those projects were big successes, but others were epic failures (big data can be a big challenge). From those failures, I’ve developed a few guiding principles that I use when tackling any new data project, in any industry. So, in an effort to prevent others from repeating my past mistakes, here are my three main tips to a successful foray into big data …
Start Smaller Than You Expect
It can be hard not to set lofty goals for a project. And with 90% of unstructured data currently unused, most organizations really do have seemingly endless opportunities to put data to use. But I learned the hard way that it’s best to keep the scope for an initial project relatively small.
In one organization I was a part of, we had a large client who wanted to collect transactional, clickstream, marketing analytics and CRM data to improve their customer loyalty card program. My team created a fantastic plan to engineer the data pipelines necessary to begin piping from each source and build a dashboard for easy analysis. The problem was, we decided to set everything up before beginning any analysis work. Time to first value was a solid 3-4 months. Eventually, our relationship with that client went south.
Suddenly, my team found itself a lot smaller and forced, for the first time ever, to prove that we could deliver analysis results quickly. One major prospect promised to sign a contract, but would only uphold their end if they experienced value within 3-4 weeks. Out of money, we had no choice but to scramble to meet their demands.
To do that required more focus than we were used to. We started by asking ourselves these simple questions:
What problem are they trying to solve?
What is the potential value/impact to the business?
How will we measure a “successful project”?
We wrote the answers down, set to work and referred back to them frequently to maintain focus. We weren’t about to let scope creep beyond the basics; we’d be sunk.
Turns out, we could move forward quickly if we were willing to roll up our sleeves and get started with baby steps instead of mustering energy for a big leap. Setup for data sources like clickstream, marketing analytics and CRM data could be placed in priority order and meanwhile, we could start pulling in easy-to-reach data, like transactional records, immediately.
Better yet, we realized that we were piloting big data integration for an entire organization. Once we knocked an initiative out of the park for one part of the company, others were willing to invest the resources necessary to tackle much larger projects.
Know More Than Your Data
As you assemble your crack team of data whizzes, you’ll need more than standard project roles (like project manager and project sponsor). Early on, I worked at Facebook, which did this well. Every internal project team combined three major skillsets:
Data Developer/Engineer - This person gets the data from all those disparate locations into a central location and a format that is conducive to analysis.
Data Analyst/Scientist - Somebody has to dig in and find meaning in the madness. If you’re starting to build out basic reporting and dashboards, you can probably get by with a good data analyst. If you’re looking to work with unstructured data or do more advanced analysis (like machine learning), you’re going to want to invest in a data scientist.
Subject Matter Expert - Even the best data scientist in the world isn’t going to be able to solve your problems without adequate knowledge of your business. It’s this person’s job to make sure that the other members of the project team have the context and knowledge to build a meaningful solution.
Facebook did this so well, in fact, that I didn't pause to consider exactly why projects were successful. Which means I didn’t instinctively think to replicate this until it was nearly too late. During my early days at Astronomer, my team supplied data engineers and a data scientist for a customer, as usual, and assumed that our contact there would serve as the subject matter expert. Often, that happens naturally. At the same time, however, our customer assumed that we would do the work on their behalf without much (if any) need for input. After three months of no real value, we finally put two and two together and rectified the problem.
As may seem obvious, subject matter experts are the ones most frequently overlooked in a data initiative, which is detrimental, if not dangerous, to an organization.
Choose a Practical Technology Solution
The third thing to keep in mind is to be practical when choosing technology. All of us want what's newest and most cutting edge. It makes sense: there’s always some great new tech with soaring promises. But sometimes it isn’t the best solution for the problem at hand. Here are a few areas where I now pursue function over fashion:
Data Processing: I was a part of a company that decided to transfer data from Oracle Exadata (a paid storage solution) to Hadoop (an open source option). We needed to cut costs, and Hadoop was much cheaper, yet had comparable capabilities. The problem is, while the code was open source, implementing the technology and extracting, transforming and loading all that data was a nightmare. What we didn’t realize is that, unless we were regularly processing and analyzing terabytes or even petabytes of data (we weren’t), setting up a Hadoop cluster created unnecessary overhead. Instead, we should have considered a traditional relational database management system (RDBMS) like Amazon RDS, or a NoSQL solution.
Data Analytics: Right out of college, I worked at a bank. There, nobody had access to data except the IT department. Getting information literally took weeks. Then, on my first day at Facebook, I was given access to all of their data. Even non-tech roles were encouraged to learn SQL, so they could write their own reports and conduct their own analysis. This cemented for me that data and insights are only valuable if they’re easily accessible by those who can benefit from them. Consider interactive dashboards