Charles Book Club

CharlesBookClub.csv is the dataset for this case study.

The Book Industry

Approximately 50,000 new titles, including new editions, are published each year in the United States, giving rise to a $25 billion industry in 2001. In terms of percentage of sales, this industry may be segmented as follows:

16% Textbooks
16% Trade books sold in bookstores
21% Technical, scientific, and professional books
10% Book clubs and other mail-order books
17% Mass-market paperbound books
20% All other books

Book retailing in the United States in the 1970s was characterized by the growth of bookstore chains located in shopping malls. The 1980s saw increased purchases in bookstores stimulated by the widespread practice of discounting. By the 1990s, the superstore concept of book retailing gained acceptance and contributed to the double-digit growth of the book industry.

Conveniently situated near large shopping centers, superstores maintain large inventories of 30,000-80,000 titles and employ well-informed sales personnel. Book retailing changed fundamentally with the arrival of Amazon, which started out as an online bookseller and, as of 2015, was the world’s largest online retailer of any kind. Amazon’s margins were small and the convenience factor high, putting intense competitive pressure on all other book retailers. Borders, one of the two major superstore chains, discontinued operations in 2011.

Subscription-based book clubs offer an alternative model that has persisted, though it too has suffered from the dominance of Amazon.

Historically, book clubs offered their readers different types of membership programs. Two common membership programs are the continuity and negative option programs, which are both extended contractual relationships between the club and its members. Under a continuity program, a reader signs up by accepting an offer of several books for just a few dollars (plus shipping and handling) and an agreement to receive a shipment of one or two books each month thereafter at more-standard pricing. The continuity program is most common in the children’s book market, where parents are willing to delegate the rights to the book club to make a selection, and much of the club’s prestige depends on the quality of its selections. In a negative option program, readers get to select how many and which additional books they would like to receive. However, the club’s selection of the month is delivered to them automatically unless they specifically mark “no” on their order form by a deadline date. Negative-option programs sometimes result in customer dissatisfaction and always give rise to significant mailing and processing costs.

In an attempt to combat these trends, some book clubs have begun to offer books on a positive option basis, but only to specific segments of their customer base that are likely to be receptive to specific offers. Rather than expanding the volume and coverage of mailings, some book clubs are beginning to use database-marketing techniques to target customers more accurately. Information contained in their databases is used to identify who is most likely to be interested in a specific offer. This information enables clubs to design special programs carefully tailored to meet their customer segments’ varying needs.

Database Marketing at Charles

The Club: The Charles Book Club (CBC) was established in December 1986 on the premise that a book club could differentiate itself through a deep understanding of its customer base and by delivering uniquely tailored offerings. CBC focused on selling specialty books by direct marketing through a variety of channels, including media advertising (TV, magazines, newspapers) and mailing. CBC is strictly a distributor and does not publish any of the books that it sells. In line with its commitment to understanding its customer base, CBC built and maintained a detailed database of its club members. Upon enrollment, readers were required to fill out an insert and mail it to CBC. Through this process, CBC created an active database of 500,000 readers; most were acquired through advertising in specialty magazines.

The Problem: CBC sent mailings to its club members each month containing the latest offerings. On the surface, CBC appeared very successful: mailing volume was increasing, book selection was diversifying and growing, and its customer database was increasing. However, its bottom-line profits were falling. The decreasing profits led CBC to revisit its original plan of using database marketing to improve mailing yields and to stay profitable.

A Possible Solution: CBC embraced the idea of deriving intelligence from its data to know its customers better and to enable multiple targeted campaigns in which each target audience would receive appropriate mailings. CBC’s management decided to focus its efforts on the most profitable customers and prospects and to design targeted marketing strategies to best reach them. The two processes it had in place were:

Customer acquisition:
New members would be acquired by advertising in specialty magazines, newspapers, and on TV.
Direct mailing and telemarketing would contact existing club members.
Every new book would be offered to club members before general advertising.
Data collection:
All customer responses would be recorded and maintained in the database.
Any information not being collected that is critical would be requested from the customer.

Targeted promotions were considered to be of prime importance. Other opportunities to create successful marketing campaigns based on customer behavior data (returns, inactivity, complaints, compliments, etc.) would be addressed by CBC at a later stage.

Art History of Florence

A new title, The Art History of Florence, is ready for release. CBC sent a test mailing to a random sample of 4000 customers from its customer base. The customer responses have been collated with past purchase data. Each row (or case) in the spreadsheet (other than the header) corresponds to one market test customer. Each column is a variable, with the header row giving the name of the variable. The variable names and descriptions are given below.

Variable Name   Description
Seq#            The sequence number in the partition
ID#             Identification number in the full (unpartitioned) market test dataset
Gender          0 = Male, 1 = Female
M               Monetary: total money spent on books
R               Recency: months since the last purchase
F               Frequency: total number of purchases
FirstPurch      Months since the first purchase
ChildBks        Number of purchases from the category child books
YouthBks        Number of purchases from the category youth books
CookBks         Number of purchases from the category cookbooks
DoItYBks        Number of purchases from the category do-it-yourself books
RefBks          Number of purchases from the category reference books (atlases, encyclopedias, dictionaries)
ArtBks          Number of purchases from the category art books
GeoBks          Number of purchases from the category geography books
ItalCook        Number of purchases of the book titled Secrets of Italian Cooking
ItalAtlas       Number of purchases of the book titled Historical Atlas of Italy
ItalArt         Number of purchases of the book titled Italian Art
Florence        = 1 if The Art History of Florence was bought; = 0 if not
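
As a starting point for working with these variables, the following minimal sketch reads rows in this layout and computes the overall response rate on the Florence offer. It assumes the CSV header uses the variable names listed above and that every column holds whole numbers; the sample rows and the helper names `load_rows` and `response_rate` are made up for illustration and are not taken from CharlesBookClub.csv.

```python
import csv
import io

# Made-up rows for illustration only; NOT taken from CharlesBookClub.csv.
# Assumption: the real header uses the variable names listed above.
SAMPLE_CSV = """Seq#,ID#,Gender,M,R,F,FirstPurch,ArtBks,Florence
1,101,1,250,12,3,30,0,0
2,102,0,90,6,1,6,0,0
3,103,1,410,2,9,48,2,1
4,104,0,175,20,2,36,1,0
"""


def load_rows(handle):
    """Read customer rows from an open CSV handle, converting each field to int."""
    reader = csv.DictReader(handle)
    return [{name: int(value) for name, value in row.items()} for row in reader]


def response_rate(rows):
    """Return the fraction of customers whose Florence flag equals 1."""
    if not rows:
        raise ValueError("no rows to summarize")
    return sum(row["Florence"] for row in rows) / len(rows)


rows = load_rows(io.StringIO(SAMPLE_CSV))
print(len(rows), response_rate(rows))
```

With the real file, the same function could be called on a handle opened with `open("CharlesBookClub.csv", newline="")`, provided the header and integer assumptions hold.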

Data Mining Techniques

Various data mining techniques can be used to mine the data collected from the market test. No one technique is universally better than another. The particular context and the particular characteristics of the data are the major factors in determining which techniques perform better in an application. For this assignment, we focus on three fundamental techniques:

K Nearest Neighbors
Classification Trees
Logistic Regression
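
To make the first of these techniques concrete, here is a minimal pure-Python sketch of k-nearest-neighbors classification. It is an illustration under stated assumptions, not the method prescribed by the case: the function name `knn_predict`, the choice of Euclidean distance, and the toy (R, F) points are all hypothetical. In practice the predictors would usually be put on a common scale first, because distance-based methods are sensitive to units.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the majority label among the k training points closest to query."""
    if k < 1 or k > len(train_points):
        raise ValueError("k must be between 1 and the number of training points")
    # Pair each Euclidean distance with its label, then sort nearest first.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy (R, F) pairs and buy/no-buy labels, invented for illustration only.
train_points = [(2, 8), (4, 6), (3, 7), (20, 1), (26, 2), (30, 1)]
train_labels = [1, 1, 0, 0, 0, 0]

recent_frequent = knn_predict(train_points, train_labels, (5, 5), k=3)
lapsed_rare = knn_predict(train_points, train_labels, (25, 1), k=3)
print(recent_frequent, lapsed_rare)
```

An odd `k` avoids tied votes when there are only two classes, as with the Florence variable.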

In the direct marketing business, the most commonly used variables are the RFM variables:

R = recency, time since last purchase
F = frequency, number of previous purchases from the company over a period
M = monetary, amount of money spent on the company’s products over a period

The assumption is that the more recent the last purchase, the more products bought from the company in the past, and the more money spent in the past buying the company’s products, the more likely the customer is to purchase the product offered.
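
One simple way to examine this assumption is to compare response rates across bins of an RFM variable. The sketch below is a hedged illustration: the helper `response_rate_by_bin`, the 12-month cutoff, and the toy rows are assumptions introduced here, not values from the case.

```python
from bisect import bisect_right
from collections import defaultdict


def response_rate_by_bin(rows, variable, cutoffs, target="Florence"):
    """Split rows into bins of `variable` at the sorted `cutoffs`; return buyer share per bin.

    Bin 0 holds values below cutoffs[0]; a value equal to a cutoff falls in the higher bin.
    """
    totals = defaultdict(int)
    buyers = defaultdict(int)
    for row in rows:
        bin_index = bisect_right(cutoffs, row[variable])
        totals[bin_index] += 1
        buyers[bin_index] += row[target]
    return {index: buyers[index] / totals[index] for index in sorted(totals)}


# Toy rows invented for illustration; R is months since the last purchase.
toy_rows = [
    {"R": 2, "Florence": 1},
    {"R": 4, "Florence": 0},
    {"R": 10, "Florence": 1},
    {"R": 14, "Florence": 0},
    {"R": 20, "Florence": 0},
    {"R": 30, "Florence": 1},
]

# Arbitrary cutoff: bin 0 is R < 12 months, bin 1 is R >= 12 months.
rates = response_rate_by_bin(toy_rows, "R", [12])
print(rates)
```

If the assumption holds in the real data, the bin with the more recent purchases would show the higher buyer share; the same helper can be applied to F and M with cutoffs chosen for those variables.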

For The Art History of Florence, CBC wants to use the following explanatory variables (including R, F, and M) in the data to predict whether a customer would order the book.