Valentine’s Day was last week, and at the invitation of my friends here at fusionSpan, I’m popping in to share a few thoughts about my one true love – data!
I Love that We Are (Finally) Moving Past Just Talking about Data
The past 12 months have brought change to just about every corner of our lives. In the spirit of making sparkling spiked lemonade out of an abundance of lemons, one positive has been the growing acceptance of the critical role data can (and should, and does) play in our organizations. I credit some of this to the omnipresence of COVID dashboards, charts, and other visualizations. It’s a smart tactic: pairing visuals with narrative engages both our quick-thinking reactionary instincts and our deep-thinking analytical processes. We see a picture that uses elements like color and directionality to immediately communicate whether something is good, bad, or neutral while we digest the subject matter expert’s interpretation to understand the information and how to apply it. Associations are eager to do the same and are working to acquire or bolster the necessary resources, whether to support more robust and timely internal metrics, to dig deep into the “whys,” or to establish themselves as thought leaders and resources by providing meaningful information to their membership.
The rest I credit to the tireless work of the many data champions in the association space who have been patiently educating us for years about not just the value but the necessity of robust data strategies at our organizations. To navigate these uncharted waters, we need to do simple things like separating normal fluctuations from sound-the-alarm spikes while also generating resources that support our role as thought leaders or experts in our respective industries. It’s purely personal, anecdotal data, but I’ve historically observed a tendency for us in the association space to sell ourselves short because we don’t have as much data as the tech behemoths. Now, though, we are seeing that you don’t need “big data” to have impactful data, and that the worst thing you can do with data is not use it.
I Love that Technologies Are Becoming More Affordable
Conveniently, just as we are finally putting our data where our mouth is (that’s the expression, right?), the tools and technologies that go with it are more accessible and affordable than ever. Low- and no-code platforms for connecting disparate data sources are increasingly available and ready to use right out of the proverbial box. And it’s not just connectors: layering in elements of machine learning or artificial intelligence can be done with a few clicks instead of weeks spent wrestling with packages and code in an IDE. Even building a labeled dataset and training a model don’t have to be hurdles anymore, thanks to services like Amazon’s Mechanical Turk and SageMaker.
With all of this analytical power at our fingertips, we humans are freed up to do the interpreting and communicating. Instead of spending time matching records from separate sources or combing through and tagging content for a taxonomy project, we can skip to the fun part and use the data to understand and improve our programs and services. Maybe we can use our newfound free time (ha!) to build out a knowledge base so that we can finally deploy that chatbot we keep talking about…
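For the curious, here’s a minimal sketch of the kind of record matching those tools now handle for us: linking people across two systems by fuzzy name similarity. The sample records, field names, and the 0.75 threshold are all illustrative assumptions, not a reference to any particular product.

```python
# A minimal sketch of the record matching these platforms automate.
# The sample records and the 0.75 threshold are hypothetical.
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Ratio of matching characters between two strings, 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

ams_members = [{"id": 101, "name": "Katherine Smith"},
               {"id": 102, "name": "Robert Jones"}]
event_attendees = [{"email": "kate@example.org", "name": "Kate Smith"},
                   {"email": "bob@example.org", "name": "Bob Jones"}]

# For each AMS member, find the best-scoring attendee name.
for member in ams_members:
    best = max(event_attendees,
               key=lambda a: similarity(member["name"], a["name"]))
    score = similarity(member["name"], best["name"])
    if score >= 0.75:
        print(f"{member['name']} -> {best['name']} (score {score:.2f})")
    else:
        print(f"{member['name']}: no confident match")
```

The low-code platforms are essentially hiding a loop like this behind a configuration screen, which is exactly the point.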
And I Even Love that There’s Still Work to Be Done
So, everyone is all-in on data and the tools are cheap and readily available…that’s it, right? Not so fast. There’s a general rule of thumb in analytics we affectionately call the “80-20 rule”: 80% of your time is spent gathering and preparing the data, and only 20% is spent actually analyzing it and gaining insights. I happen to find that the split is more often along the lines of 99-1 (and I’ve probably got the timesheet data to prove it, but I’m too afraid to look!). And this doesn’t even include the work of defining the right questions to ask, but that’s a blog post for another day. So how can we get closer to that 80-20 (or better) split?
Committing to Data Hygiene
Your data is only as useful as it is clean. The odd record or two with bad data might not be the end of the world, but if prepping your data for analysis involves an elaborate series of transforms aimed at correcting common mistakes or historically mis-captured information, it might be time to tackle the issues head-on. My recommendation, however, is not to put data projects on hold until “after cleanup” (which is about the same as always saying “tomorrow”). Instead, use your data projects to identify areas of opportunity for cleanup. Work with what you’ve got. Document known issues/gotchas, fix data at the source whenever possible, and establish data collection norms to do better going forward. It may not happen overnight, but incremental progress will build towards a beautiful future.
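To make that concrete, here’s a minimal sketch (in Python with pandas) of the kind of transform pipeline that, once it grows elaborate, signals it’s time to fix the data at the source. The file name, column names, and placeholder values are all hypothetical.

```python
# A minimal sketch of an "elaborate series of transforms" in pandas.
# The file name, column names, and junk values are hypothetical.
import pandas as pd

# Hypothetical member export with years of inconsistent data entry.
members = pd.read_csv("member_export.csv")

# Normalize text fields that were captured inconsistently over time.
members["email"] = members["email"].str.strip().str.lower()
members["state"] = members["state"].str.strip().str.upper()

# Treat known placeholder entries as missing rather than real values.
members = members.replace({"N/A": pd.NA, "none": pd.NA, "": pd.NA})

# Flag (rather than silently drop) records that share an email address,
# so the duplicates can be merged in the source system.
members["dup_email"] = members.duplicated("email", keep=False)
print(members[members["dup_email"]].sort_values("email"))
```

Notice the last step flags duplicates instead of quietly dropping them; that list becomes your cleanup worklist for fixing records at the source.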
Defining and Understanding Your Data Model
Ideally, you’ve already gone through a data warehousing project and have end-user-ready data sources available in your BI tool of choice. But if you haven’t (or even if you have), it’s always good to have a current “map” of your data ecosystem with details on how the different data sources relate to each other for analysis. Note the unique identifiers that link records, as well as the granularity of each dataset, so that you aggregate the data correctly and filter out any duplication. This may not sound like the most exciting work, but it will pay dividends when you start tackling new projects or using any of the nifty new ML or AI offerings.
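Here’s a minimal sketch of why that granularity note matters. The tables and columns are hypothetical: one dataset has a row per member, the other a row per event registration, and joining them at mismatched grains quietly double counts dues.

```python
# A minimal sketch of a granularity pitfall; the tables are hypothetical.
import pandas as pd

members = pd.DataFrame({             # grain: one row per member
    "member_id": [1, 2, 3],
    "dues": [150, 150, 300],
})
registrations = pd.DataFrame({       # grain: one row per registration
    "member_id": [1, 1, 2],
    "event_fee": [50, 75, 50],
})

# Joining at mismatched grains repeats member 1's dues on both of their
# registration rows, so summing dues afterwards inflates the total.
naive = members.merge(registrations, on="member_id", how="left")
print(naive["dues"].sum())           # 750 -- double counted

# Aggregate the finer-grained table up to the member grain first.
reg_totals = registrations.groupby("member_id", as_index=False)["event_fee"].sum()
safe = members.merge(reg_totals, on="member_id", how="left")
print(safe["dues"].sum())            # 600 -- correct
```

Rolling the finer-grained table up to the coarser grain before joining keeps every downstream sum honest, and it’s exactly the kind of rule your data ecosystem “map” should capture.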