
“Big Data” Success Starts With Good Data Governance

May 19, 2014

(This post appeared on our successor blog, The Analysights Data Mine, on Friday, May 9, 2014). 

As data continues to proliferate unabated in organizations, coming in faster and from more sources each day, decision makers find themselves perplexed, struggling with several questions: How much data do we have? How fast is it coming in? Where is it coming from? What form does it take? How reliable is it? Is it correct? How long will it be useful? And this is before they even decide what they can and will do with the data!

Before a company can leverage big data successfully, it must decide upon its objectives and balance them against the data it has, the regulations governing the use of that data, and the information needs of all its functional areas. And it must assess the risks both to the security of the data and to the company’s viability. That is, the company must establish effective data governance.

What is Data Governance?

Data governance is a young and still-evolving system of practices designed to help organizations ensure that their data is managed properly and in the best interest of the organization and its stakeholders. In essence, it is an organization’s process for handling data, encompassing its data infrastructure, the quality and management of its data, its policies for using its data, its business process needs, and its risk management needs. An illustration of data governance is shown below:

[Figure: an illustration of data governance]

Why Data Governance?

Data has many uses and comes in many different forms; it takes up a lot of space; it can be siloed, subject to certain regulations, off-limits to some parties yet freely available to others; and it must be validated and safeguarded. Just as importantly, data governance ensures that the business is using its data toward solving its defined business problems.

The explosion of regulations such as Sarbanes-Oxley, Basel I, Basel II, Dodd-Frank, and HIPAA, along with a series of other rules regarding data privacy and security, is making the role of data governance all the more important.

Moreover, data comes in many different forms. Companies get sales data from the field or from a store location; they get information about their employees from job applications. Data of this nature is often structured. Companies also get data from their web logs and from social media such as Facebook and Twitter, as well as data in the form of images, text, and so forth; these data are unstructured, but must be managed regardless. Through data governance, the company can decide what data to store and whether it has the infrastructure in place to store it.

The 6 Vs of Big Data

Many people aware of big data are familiar with its proverbial “3 Vs” – Volume, Variety, and Velocity.  But Kevin Normandeau, in a post for Inside Big Data, suggests that three more Vs pose even greater issues: Veracity (cleanliness of the data), Validity (correctness and accuracy of the data), and Volatility (how long the data remains valid and should be stored).  These additional Vs make data governance an even greater necessity.

What Does Effective Data Governance Look Like?

Effective data governance begins with designation of an owner for the governance effort – an individual or team who will be held accountable.

The person or team owning the data governance function must be able to communicate with all department heads to understand the data they have access to, what they use it for, where they store it, and what they need it for. They must also be adept at working with third-party vendors and external customers of their data.

The data governance team must understand both internal policies and external regulations governing the use of data and what specific data is subject to specific regulations and/or policies.

The data governance team must also assess the value of the data the company collects; estimate the risks involved if the company makes decisions based on invalid or incomplete data, or if the data infrastructure fails or is hacked; and design systems to minimize those risks.

Once data has been inventoried and matched to its relevant constraints, the team must draft, document, implement, and enforce its governance processes, including its processes for data collection and storage. The team must then train the organization’s employees in the proper use and collection of the data, so that they know what they can and cannot do.

Without effective data governance, companies will find themselves vulnerable to hackers, fines, and other business interruptions. They will be less efficient, as inaccurate data leads to rework and inadequate data leads to slower, less effective decision making. And they will be less profitable, as lost or incomplete data will often cause them to miss opportunities or take incorrect actions. Good data governance will ensure that companies get the most out of their data.

 


 

Big Data, Big Bucks

May 6, 2014

(This post appeared last week on our successor blog, the Analysights Data Mine)

In their 1996 bestselling book, The Millionaire Next Door, Thomas J. Stanley and William D. Danko constructed profiles of the typical American millionaire.  One common characteristic the authors observed was that these millionaires “chose the right occupation.”  When Stanley and Danko wrote Millionaire, I doubt many of their research subjects were data analysts, predictive modelers, data scientists, or other “Big Data” professionals; but if they were to write a new edition today, I’ll bet there would be a lot more on the list.  “Big Data” jobs seem to be “the right occupation” today.

In a recent interview with the Wall Street Journal, veteran analytics recruiter Linda Burtch of Burtch Works predicted that job candidates with little familiarity with “Big Data” will face a “permanent pink slip,” while observing that analytics professionals earn a median base salary of $90,000 per year. Ms. Burtch distinguishes between “analytics” professionals (who typically deal with structured data sets) and “data scientists” (who typically work with large, unstructured data sets), when classifying income levels.  Data scientists, Burtch Works found, make a median base salary of $120,000.

Even more impressive are the median base salaries of entry-level professionals, those with three years’ experience or less: $65,000 for analytics professionals and $80,000 for data scientists. At nine or more years’ experience, the median base salaries rise to $115,000 and $150,000, respectively.

Much of the reason for the hefty salaries is that companies often don’t understand what skill sets they need. Ms. Burtch mentions this in her comments to the Wall Street Journal, and I indicated as much in a previous blog post. Add to that the fact that the needed skill sets are highly specialized, and relatively few professionals possess them. Because of this scarcity, candidates can command such high salaries.

For companies, this suggests that in order to get the most value out of a “Big Data” hire, they must first decide the typical projects they will expect the candidate to perform, and then set the required skill set and years of experience accordingly. Then the company can budget the salary it is willing to pay. This will ensure that the company isn’t hiring someone with 10 years’ experience in data analytics and paying that person $120,000 per year just to pull data for mailing lists, when it should have hired someone out of college for about one-third of that.

For candidates, the breadth of skill sets employers seek in “Big Data” professionals suggests they can maximize their salaries by continuing to broaden their skills and experience within the data realm. For example, someone with years of SAS programming and SQL experience may branch out to other programming tools, such as R and Python. Or such a professional may expand his or her skill set by developing proficiency in data visualization tools such as Tableau or QlikView.

Working in “Big Data” may not make someone “the millionaire next door,” but it may bring him or her pretty close.

 


Big Data Success Starts With a Well-Defined Business Problem

April 18, 2014

(This post also appears on our successor blog, The Analysights Data Mine).

Lots of companies are jumping on the “Big Data” bandwagon; few of them, however, have given real thought to how they will use their data or what they want to achieve with the knowledge the data will give them.  Before reaping the benefits of data mining, companies need to decide what is really important to them.  In order to mine data for actionable insights, technical and business people within the organization need to discuss the business’ needs.

Data mining efforts and processes will vary, depending on a company’s priorities.  A company will use data very differently if its aim is to acquire new customers than if it wants to sell new products to existing customers, or find ways to reduce the cost of servicing customers.  Problem definition puts those priorities in focus.

Problem definition isn’t just about identifying the company’s priorities, however.  In order to help the business achieve its goals, analysts must understand the constraints (e.g., internal privacy policies, regulations, etc.) under which the company operates, whether the necessary data is available, whether data mining is even necessary to solve the problem, the audience at whom data mining is directed, and the experience and intuition of the business and technical sides.

What Does The Company Want to Solve?

Banks, cell phone companies, cable companies, and casinos collect lots of information on their customers. But their data is of little value if they don’t know what they want to do with it. In the banking industry, where acquiring new customers often means luring them away from another bank, a bank’s objective might be to cross-sell, that is, to get its current depositors and borrowers to acquire more of its products so that they will be less inclined to leave the bank. If that’s the case, then the bank’s data mining effort will involve looking at the products its current customers have and the order and manner in which they acquired those products.

On the other hand, if the bank’s objective is to identify which customers are at risk of leaving, its data mining effort will examine the activity of departing households in the months leading up to their defection, and compare it to those households it retained.

If a casino’s goal is to decide on what new slot machines to install, its data mining effort will look at the slot machine themes its top patrons play most and use that in its choice of new slot machines.

Who is the Audience the Company is Targeting?

Ok, so the bank wants to prevent customers from leaving.  But do they want to prevent all customers from leaving?  Usually, only a small percentage of households account for all of a bank’s profit; many banking customers are actually unprofitable.  If the bank wants to retain its most profitable customers, it needs only analyze that subgroup of its customer base.  The bank’s predictions of its premier customers’ likelihood to leave based on a model developed on all its customers would be highly inaccurate.  In this case, the bank would need to build a model only on its most profitable customers.
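To make that concrete, here is a minimal sketch in Python of restricting the modeling population before any model is fit. The pandas table, its column names, and the top-quartile cutoff are all hypothetical illustrations, not details from the post:

```python
import pandas as pd

# Hypothetical customer table: one row per household, with annual
# profitability and a churn flag (1 = left the bank, 0 = retained).
customers = pd.DataFrame({
    "household_id":  [1, 2, 3, 4, 5, 6],
    "annual_profit": [1200.0, -35.0, 40.0, 900.0, -10.0, 2500.0],
    "churned":       [0, 1, 0, 1, 0, 0],
})

# Keep only the most profitable households (here, the top quartile by
# annual profit) and build the retention model on that subgroup alone.
cutoff = customers["annual_profit"].quantile(0.75)
premier = customers[customers["annual_profit"] >= cutoff]
print(premier)
```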

Does the Problem Require Data Mining?

Data mining isn’t always needed.  Years ago, when I was working for a catalog company, I developed regression models to predict which customers were likely to order from a particular catalog.  When a model was requested for the company’s holiday catalog, I was told that it would go to 85 percent of the customer list.  When such a large proportion of the customer base – or the entire customer base for that matter – is to receive communication, then a model is not necessary.  More intuitive methods would have sufficed.

Is Data Available?

Before a data mining effort can be undertaken, the data necessary to solve the business problem must be available or obtainable. If a bank wants to know the next best product to recommend to its existing customers, it needs to know the first product those customers acquired, how they acquired it, the length of time between their first and second products, then between their second and third, and so forth. The bank also needs to understand which products its customers acquired simultaneously (such as a checking account and a credit card), current activity with those products, and the sequence of product acquisition (e.g., checking account first, savings account second, certificate of deposit third, etc.).
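As a rough illustration of how such an acquisition history might be organized, here is a short Python sketch. The table, its fields, and the dates are entirely hypothetical:

```python
import pandas as pd

# Hypothetical product-acquisition history: one row per account opened.
acquisitions = pd.DataFrame({
    "household_id": [1, 1, 1, 2, 2],
    "product": ["checking", "savings", "cd", "checking", "credit_card"],
    "open_date": pd.to_datetime([
        "2012-01-05", "2012-06-20", "2013-02-11",
        "2012-03-01", "2012-03-01",   # same day: acquired simultaneously
    ]),
})

# Order each household's products by open date, number the sequence,
# and measure the gap (in days) between successive acquisitions.
acquisitions = acquisitions.sort_values(["household_id", "open_date"])
acquisitions["seq"] = acquisitions.groupby("household_id").cumcount() + 1
acquisitions["days_since_prev"] = (
    acquisitions.groupby("household_id")["open_date"].diff().dt.days
)
print(acquisitions)
```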

It is extremely important that analysts consult both those on the business side and the IT department about the availability of data.  These internal experts often know what data is collected on customers, where it resides, and how it is stored.  In many cases, these experts may have access to data that doesn’t make it into the enterprise’s data warehouse.  And they may know what certain esoteric values for fields in the data warehouse mean.  Consulting these experts can save analysts a lot of time in understanding the data.

Under What Constraints Does the Business Operate?

Companies have internal policies regulating how they operate; are subject to regulations and laws governing the industries and localities in which they operate; and are bound by ethical standards in those industries and locations.

Often, a company has access to data that, if used in making business decisions, can be illegal or viewed as unethical.  The company doesn’t acquire this data illegally; the data just cannot be used for certain business practices.

For example, I was building customer acquisition models for a bank a few years ago.  The bank’s data warehouse had access to summarized credit score statistics by block groups, as defined by the U.S. Bureau of the Census.  However, banks are subject to the Community Reinvestment Act (CRA), a 1977 law that was passed to prevent banks from excluding low- to moderate-income neighborhoods in their communities from lending decisions.  Obviously, credit scores are going to be lower in lower-income areas. Hence, under CRA guidelines, I could not use the summarized credit statistics to build a model for lending products.  I could, however, use those statistics for a model for deposit products; for post campaign analysis, to see which types of customers responded to the campaign; and also to demonstrate compliance with the CRA.

In addition, the bank’s internal policies did not allow the use of marital status in promoting products. Hence, when using demographic data that the bank purchased, I had to ignore the field “married” when building my model. In cases like these, less direct approaches can be used. The purchased data also contained a field called “number of adults” (in the household). This was totally appropriate to use, since a household with two adults is not necessarily a married-couple household.
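One simple way to enforce such a restriction is to keep the off-limits fields in an explicit list that is dropped in code, rather than relying on memory. A minimal Python sketch, with a hypothetical purchased-demographics table and illustrative field names:

```python
import pandas as pd

# Hypothetical purchased demographic overlay.
demographics = pd.DataFrame({
    "household_id":     [1, 2, 3],
    "married":          ["Y", "N", "Y"],  # off-limits under internal policy
    "number_of_adults": [2, 1, 2],        # acceptable: two adults need not mean married
    "median_income":    [54000, 72000, 61000],
})

# Document the restricted fields in one place and enforce the exclusion.
RESTRICTED_FIELDS = ["married"]
model_inputs = demographics.drop(columns=RESTRICTED_FIELDS)

print(model_inputs.columns.tolist())
# ['household_id', 'number_of_adults', 'median_income']
```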

Again, the analyst must consult the company’s business experts to understand these operational constraints.

Are the Business Experts’ Opinions and Intuition Spot-On?

It’s often said that novices make mistakes out of ignorance and veterans make mistakes out of arrogance.  The business experts have a lot of experience in the company and a great deal of intuition, which can be very insightful.  However, they can be wrong too.  With every data mining effort, the data must be allowed to tell the story.  Does the data validate what the experts say?  For example, most checking accounts are automatically bundled with a debit card; a bank’s business experts know this; and the analysis will often bear this out.

However, if the business experts say that a typical progression in a customer’s banking relationship starts with demand deposit accounts (e.g., checking accounts) then consumer lending products (e.g., auto and personal loans), followed by time deposits (e.g., savings accounts and certificates of deposit), does the analysis confirm that?

 

Problem definition is the hardest, trickiest, yet most important prerequisite to getting the most out of “Big Data.” Beyond knowing what the business needs to solve, analysts must also consider the audience the data mining effort is targeting; whether data mining is necessary; the availability of data and the conditions under which it may be used; and the experience of the business experts. Effective problem definition begets data mining efforts that produce insights a company can act upon.

Read All About It: Why Newspapers Need Marketing Analytics

October 26, 2010

After nearly 20 years, I decided to let my subscription to the Wall Street Journal lapse. A few months ago, I did likewise with my longtime subscription to the Chicago Tribune. I didn’t want to end my subscriptions, but as a customer, I felt my voice wasn’t being heard.

Some marketing research and predictive modeling might have enabled the Journal and the Tribune to keep me from defecting. From these efforts, both publications could have spotted my increasing frustration and dissatisfaction and intervened before I chose to vote with my feet.

Long story short, I let both subscriptions lapse for the same reason: chronic unreliable delivery, which was allowed to fester for many years despite numerous calls by me to their customer service numbers about missing and late deliveries.

Marketing Research

Both newspapers could have used marketing research to alert them to the likelihood that I would not renew my subscriptions. They each had lots of primary research readily available to them, without needing to do any surveys: my frequent calls to their customer service department, with the same complaint.

Imagine the wealth of insights both papers could have reaped from this data: they could determine the most common breaches of customer service; by looking at the number of times customers complained about the same issue, they could determine where problems were left unresolved; by breaking down the most frequent complaints by geography, they could determine whether additional delivery persons needed to be hired, or if more training was necessary; and most of all, both newspapers could have also found their most frequent complainers, and reached out to them to see what could be improved.
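As a sketch of the tabulations described above, assuming a hypothetical complaint log with subscriber, issue, and ZIP code fields (all names and values illustrative):

```python
import pandas as pd

# Hypothetical customer-service log: one row per complaint call.
complaints = pd.DataFrame({
    "subscriber_id": [101, 101, 102, 103, 101, 104],
    "issue": ["missed delivery", "missed delivery", "late delivery",
              "billing", "missed delivery", "late delivery"],
    "zip_code": ["60601", "60601", "60614", "60614", "60601", "60601"],
})

# The most common breaches of customer service.
print(complaints["issue"].value_counts())

# Repeated complaints about the same issue point to unresolved problems.
repeats = complaints.groupby(["subscriber_id", "issue"]).size()
print(repeats[repeats > 1])

# Complaints by geography: where might more carriers or training help?
print(complaints["zip_code"].value_counts())

# The most frequent complainers, worth proactive outreach.
print(complaints["subscriber_id"].value_counts().head())
```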

Both newspapers could have also conducted regular customer satisfaction surveys of their subscribers, asking about overall satisfaction and likelihood of renewing, followed by questions about subscribers’ perceptions about delivery service, quality of reporting, etc. The surveys could have helped the Journal and the Tribune grab the low-hanging fruit by identifying the key elements of service delivery that have the strongest impact on subscriber satisfaction and likelihood of renewal, and then coming up with a strategy to secure satisfaction with those elements.

Predictive Modeling

Another way both newspapers might have been able to intervene and retain my business would have been to predict my likelihood of lapse. This so-called attrition or “churn” modeling is common in industries whose customers are continuity-focused: newspapers and magazines, credit cards, membership associations, health clubs, banks, wireless communications, and broadband cable to name a few.

Attrition modeling (which, incidentally, will be discussed in the next two upcoming Forecast Friday posts) involves developing statistical models comparing attributes and characteristics of current customers with those of former, or churned, customers. The dependent variable being measured is whether a customer churned, so it would be a 1 if “yes” and a 0 if “no.”

Essentially, in building the model, the newspapers would look at several independent, or predictor, variables: customer demographics (e.g., age, income, gender, etc.), frequency of complaints, geography, to name a few. The model would then identify the variables that are the strongest predictors of whether a subscriber will not renew. The model will generate a score between 0 and 1, indicating each subscriber’s probability of not renewing. For example, a probability score of .72 indicates that there is a 72% chance a subscriber will let his/her subscription lapse, and that the newspaper may want to intervene.
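Here is a minimal sketch of such an attrition model in Python with scikit-learn. The subscriber table, the two predictors, and the 0.70 intervention threshold are hypothetical illustrations of the approach, not details from either newspaper:

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Hypothetical subscriber history: churned = 1 if the subscription lapsed.
subs = pd.DataFrame({
    "complaints_12m": [0, 5, 1, 7, 0, 3, 6, 0],
    "tenure_years":   [12, 3, 8, 2, 15, 4, 1, 9],
    "churned":        [0, 1, 0, 1, 0, 1, 1, 0],
})

X = subs[["complaints_12m", "tenure_years"]]
y = subs["churned"]

# Fit the attrition model, then score each subscriber's probability
# of lapsing (a value between 0 and 1).
model = LogisticRegression().fit(X, y)
subs["p_churn"] = model.predict_proba(X)[:, 1]

# Flag subscribers above an intervention threshold, e.g. 0.70.
print(subs[subs["p_churn"] > 0.70])
```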

In my case, both newspapers might have run such an attrition model to see if number of complaints in the last 12 months was a strong predictor of whether a subscriber would lapse. If that were the case, I would have a high probability of churn, and they could then call me; or, if they found that subscribers who churned were clustered in a particular area, they might be able to look for systemic breakdowns in customer service in that area. Either way, both papers could have found a way to salvage the subscriber relationship.


The Man Who Feared Analytics

June 9, 2010

A business owner was once referred to me by a colleague with whom he had already been doing business. For many years, the businessman’s photography business had been sustained through direct mail advertising, and he often received a 5%-7% response rate, an accomplishment that would boggle the minds of most direct marketers. But with the recent economic downturn, combined with photography being a discretionary expense, he soon found his direct mail solicitations bringing in a puny 0.8% response rate. The business owner had a great product, a great price, and a great offer, but at that response rate, he was no longer breaking even.

My colleague and I spoke with the businessman about his dilemma. We talked through his business; we looked at his most recent mailer, learned how he obtained his mailing lists, and discussed his promotion schedule. We found that the photographer would buy a list of names, mail them once, and then use a different list, not giving people enough opportunity to develop awareness of his business. We also found that he didn’t have much information about the people he was mailing.

We recommended analytics, which could help the photographer maximize his margin by improving both the top and bottom lines. Analytics would first help him understand which customers were responding to his mailings. Then he could purchase lists of people with characteristics similar to those past respondents. His response rate would go up, since he would be sending to a list of people most receptive to his photography. He would also be able to mail fewer people, cutting out those with little likelihood of response. He could then use the savings to remail the members of his target segments who hadn’t responded to his earlier mailing, and thus increase their awareness. It all sounded good to the photographer.

And then, he decided he was going to wait to see if things got better!

Why the Fear of Analytics?

The photographer’s decision is a common refrain of marketers. Marketers and business owners who are introduced to analytics are like riders on a roller coaster: thrilled and nervous at the same time. While marketers are excited about the benefits of analytics, they are also concerned about its cost; they’re afraid of change; and they’re intimidated by the perceived complexity of analytics. We’ll tackle each of these fears here.

FEAR #1: Analytics could be expensive.

REALITY: Analytics is an investment that pays for itself.

The cost of analytics can appear staggering, especially in lean times. Some of the most sophisticated analytics techniques can run into tens, if not hundreds, of thousands of dollars for a large corporation. For many smaller companies, analytics can run a few thousand dollars, which is still a lot of money. But analytics is not an expense; you are getting something great in return: the insights you need to make better-informed marketing decisions and identify the areas in your marketing that you can improve or enhance; the ability to target customers and prospects more effectively, resulting in increased sales and reduced costs; and the chance to establish long-term continuous improvement systems.

Had the photographer gone through with the analytics for his upcoming mailing, the entire analysis would have cost him somewhere between $1,300 and $1,800. But that fee would have enabled him to identify where his mailings were getting the greatest bang for his buck, and he might have made up for it in reduced mailing costs and increased revenues. Once the analytics had saved or made the photographer at least $1,800, it would have paid for itself.
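To see how quickly the fee could pay for itself, here is the breakeven arithmetic as a tiny Python sketch. The $1,800 fee comes from the paragraph above; the mailing volume, cost per piece, and margin figures are purely assumed for illustration:

```python
# Hypothetical payback arithmetic for the photographer's analysis fee.
analysis_fee = 1_800.00   # upper end of the quoted $1,300-$1,800 range

# Savings: names with little likelihood of response cut from the mailing.
pieces_cut = 5_000        # assumed
cost_per_piece = 0.60     # assumed printing + postage per piece
savings = pieces_cut * cost_per_piece            # $3,000 saved

# Revenue: extra orders from better-targeted remailing.
extra_orders = 10         # assumed
margin_per_order = 150.00 # assumed profit per photography order
added_profit = extra_orders * margin_per_order   # $1,500 gained

print(savings + added_profit - analysis_fee)     # 2700.0: fee repaid, and then some
```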

FEAR #2: Analytics means a change in the way we do things.

REALITY: Analytics brings about change gradually and seamlessly.

The photographer had been using direct mail over and over again, because it worked over and over again – until recently. In fact, having lost so much money on his recent direct mailings, he’s probably leery of new approaches, so he stays the course out of familiarity. That’s quite common. But this is the nice part about analytics: change can be gradual! Analytics is about testing the waters, so as to reduce risk. Perhaps the photographer could have done a test where half of his mailings were executed the traditional way, and half done the way the analytics recommended. Over the course of a short period, the photographer could then decide for himself which approach was working best.
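A minimal sketch of such a half-and-half test in Python; the list size and the response counts are hypothetical:

```python
import random

# Hypothetical mailing list of prospect IDs.
prospects = list(range(10_000))

# Randomly split the list: half mailed the traditional way (control),
# half mailed per the analytics recommendation (test).
random.seed(42)
random.shuffle(prospects)
control, test = prospects[:5_000], prospects[5_000:]

# After the campaign, compare response rates between the two halves.
control_responses, test_responses = 40, 95   # hypothetical results
print(f"control: {control_responses / len(control):.2%}")  # 0.80%
print(f"test:    {test_responses / len(test):.2%}")        # 1.90%
```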

FEAR #3: Analytics is “over my head.”

REALITY: You need only understand a few high-level concepts.

Those complicated and busy mathematical formulas, in all their Greek-symbol glory, can be intimidating to people who are not mathematicians, statisticians, or economists. In fact, even I get intimidated by those equations. We must remember, however, that these formulas were developed to improve the way we do things! With analytics, all you need to know is what approach was employed, what it does, why it’s important, and how to apply it – all of which are very simple. Analysts like me deal with all the complicated stuff: finding the approach, employing it, debugging it, refining it, and then packaging it in a way that you can apply seamlessly. And if you don’t understand something about the analytical approach employed, by all means, ask! Any good analyst will give you all the guidance you need until you’re able to apply the analytics on your own.

Forgoing Analytics Can Cost Your Business Three Times Over!

Analytics is one of those tools that many marketers know can enhance their businesses, yet hold off on adopting – whether for cost, perceived complexity, or just plain fear. This inaction can be very dangerous. Analytics is not just a tool that improves your business decision making; it also helps you diagnose problems, identify opportunities, and make predictions about the future. Failure to do these properly costs you in three ways. First, you market incorrectly, wasting money. Second, you market to the wrong people; they don’t buy, and you lose revenue you could have made marketing to the right people. Third, you fail to recognize opportunities, and you forgo any sales those missed opportunities may have brought. Analytics is an investment that pays for itself, pays dividends down the road, brings about change in an easy and acceptable way, and whose benefits are easy to grasp and financially rewarding.