Spreadsheet Tips and Tricks for EBSCO Usage Reports Melissa Belvadi University of Prince Edward Island October, 2017
Types of Usage Reports ● COUNTER R4 - Book, Journal, Database, Platform ● Standard (EBSCO) - Title, Database, Interface, Link Activity, Login ● Ebook Subscription Usage Report ● Ebook CAM reports (Concurrent Access Model) ● Full Text Finder Reports ○ FTF - "resource" = title ○ PFI ○ Knowledgebase Changes - not usage ● Other lists - EBSCONET, ECM Prices, Database coverage
Spreadsheet Vocabulary Basics File = Spreadsheet (in Excel File = Workbook = Book) Sheet = Tab (in Excel = Worksheet, Table)
Download / Email and File Formats
Prefer tab-delimited - .txt / .tsv
● Change to .tsv for Google Sheets
● Excel: ok to leave .txt
● Comma (csv) can get corrupted
● Excel format can have merged cells
Best Practices: File Management
● Always (!) make a copy of the original data file; save originals and copies in separate folders
● Naming convention - name originals and copies in a pattern, taking sort order into account, for example:
○ EBSCO COUNTER JR1 2016 (original)
○ aa EBSCO COUNTER JR1 2016 (editing copy for analysis)
○ start with "aa" or something similar
■ files alphabetically higher if in same folder
■ see it first if names truncated in display - don't accidentally edit original, esp. if using search to find
Best Practices: First Edits to Sheet - 1 ● Rename sheet to something like "main" or "raw" or other specific ("JR1") - keep short, no spaces/punctuation ● Row 1 = column headers, nothing else, row 2+ = data ○ All other header info - new sheet, copy info including summary totals, rename sheet to something like "JR1 header", delete those rows from "JR1" ○ Check bottom for totals, also move to other sheet ● Delete unneeded blank rows at bottom, columns at right ● Remove columns with useless/constant data (Platform)
Best Practices: First Edits to Sheet - 2 ● Freeze top row: View - Freeze or drag fuzzy borders ● Wrap text on header row ● Add new first column - "original order" - 1,2,3 - use autofill to complete down (little square in bottom right corner of cell) Autofill tip: stops if empty cell on left - always add new columns as "A" OR next to column that cannot be empty, e.g. Title
COUNTER report before first edits
COUNTER report after first edits
Data Types: Numbers, Text, Dates, others
● TYPE() returns 1 = numeric, 2 = text (other types: boolean, errors)
● Dates are stored as numbers (type = 1)
● What looks like a number may not be stored as one
● Data that you intend as text (ISBN, ISSN) may be imported as a number and lose its leading zeros
● Tell numbers from text by the default horizontal alignment: numbers right, text left (dates right-align)
○ make the column wide enough to see this!
● When entering data by hand that you want as text, start with an apostrophe '
Best Practices: Normalize Data Common to Multiple Sheets ● ISBN ● ISSN ● Proprietary ID/Book ID
Normalize Book identifiers
ISBN: =text(substitute(A2,"-",""),"0000000000000")
where column A has the hyphenated ISBN, which may be 10 or 13 digits
Book ID, from text to numeric: =value(A2)
Book ID, from numeric to text: =text(A2,"0000000")
using as many zeros as the longest possible value; for EBSCO, usually 7 digits
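For anyone doing the same cleanup outside a spreadsheet, the identifier rules above can be sketched in Python. This is only an illustration: the function names are made up for this example, and returning non-digit values (such as an ISBN-10 ending in X) unpadded is an assumption about the desired behaviour.

```python
def normalize_isbn(raw) -> str:
    """Remove hyphens/spaces, then left-pad all-digit values with zeros to 13 characters."""
    cleaned = str(raw).replace("-", "").replace(" ", "")
    # Assumption: values containing a non-digit (e.g. a trailing X) are left unpadded.
    return cleaned.zfill(13) if cleaned.isdigit() else cleaned


def normalize_book_id(raw, width: int = 7) -> str:
    """Render a whole-number ID as zero-padded text; 7 digits is the usual EBSCO length."""
    return str(int(raw)).zfill(width)
```

normalize_isbn("978-0-306-40615-7") gives "9780306406157", and normalize_book_id(12345) gives "0012345", so both sheets end up with the same text form of the key.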
Normalized ISBN, types of orig order, ID and dates
Note: the Normalized ISBN column is placed before the ISBN column - blank ISBNs kill autofill!
ISSNs: Normalize ISSNs to include the hyphen (to force type text)
Basic: from number to text, restoring leading zeros:
=text(a2,"0000-0000")
BUT if the value is already text (0123034X), this has no effect, so:
=if(type(a2)=2, left(a2,4)&"-"&right(a2,4), text(a2,"0000-0000"))
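The same ISSN rule can be sketched in Python as a rough equivalent, not a replacement for the formula. The function name is hypothetical, and zero-padding text values to 8 characters goes slightly beyond what the spreadsheet formula does.

```python
def normalize_issn(raw) -> str:
    """Return an ISSN as text in the form NNNN-NNNN (the last character may be X)."""
    if isinstance(raw, (int, float)):
        # Stored as a number: leading zeros were lost, so restore them.
        chars = str(int(raw)).zfill(8)
    else:
        # Stored as text: drop any existing hyphen so both cases are handled alike.
        chars = str(raw).replace("-", "").strip().upper().zfill(8)
    return f"{chars[:4]}-{chars[-4:]}"
```

normalize_issn(1234567) returns "0123-4567" and normalize_issn("0123034X") returns "0123-034X".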
ISSN Normalized
Best Practices: Extract Parts of Data for Analysis ● LC Class from entire call number ● Month, weekday, year from a full date value ● Hour from a date/time value
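As an illustration of the date items in this list, here is a small Python sketch that pulls the year, month, weekday and hour out of one timestamp. The timestamp format and the function name are assumptions made for this example; the reports themselves may format dates differently.

```python
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def date_parts(stamp: str, fmt: str = "%Y-%m-%d %H:%M:%S") -> dict:
    """Split a date/time string into the parts typically used for grouping."""
    moment = datetime.strptime(stamp, fmt)
    return {
        "year": moment.year,
        "month": moment.month,
        "weekday": WEEKDAYS[moment.weekday()],  # weekday() is 0 for Monday
        "hour": moment.hour,
    }
```

Each part can then go in its own column and be used as a category later on.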
LC Call Number: Extract the LC class from Call #
To get BF from BF541.3.W67:
=if(AND(iserror(value(mid(d2,3,1))), mid(d2,3,1)<>"."), left(d2,3), if(iserror(value(mid(d2,2,1))), left(d2,2), left(d2,1)))
● AND() lets you join multiple conditions with commas
● iserror() returns TRUE if the function inside it returns an error, FALSE otherwise
● value() returns an error if the string isn't a number
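Outside a spreadsheet, the same extraction is often done with a regular expression instead of nested IFs. The sketch below takes that different route: it keeps the leading run of one to three letters. The function name is hypothetical.

```python
import re

# One to three letters at the start of the call number form the LC class.
_LC_CLASS = re.compile(r"[A-Za-z]{1,3}")


def lc_class(call_number: str) -> str:
    """Return the LC class letters, or an empty string if there are none."""
    match = _LC_CLASS.match(call_number.strip())
    return match.group(0).upper() if match else ""
```

lc_class("BF541.3.W67") returns "BF"; a blank or numeric-only call number returns an empty string rather than an error.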
LC Call Number to Text Formula
Best Practices: Convert Formulas to Plain Text After normalizing/extracting data, convert entire column from formula to hard values ● Once fixed, only risk of accidental change, no benefit ● Performance drag or limit on large spreadsheets How: Column pulldown-menu or Select Column - right-click Copy, then Right-Click Paste Special - Values only
Converting Formula to Text BEFORE
AFTER (using Copy - Paste Special - Values only)
Vlookup - Combining data from different lists Must get tables into same file: 1. Have both files open (or third new file) 2. Right-click sheet tab - Copy to (Excel: Move or copy…) 3. "Recent" tab easiest way to find file to select ● Excel: Be sure to select checkbox "create a copy"! ● Rename the sheet-tab copied to a short name, no spaces
Vlookup - Combining data from different lists
Vlookup
Need an "index" column that is the basis for the match - usually ISBN, ISSN, ID. Titles make very poor matches.
Be sure your index column is normalized in both sheets.
Decide which sheet is getting the data from the other.
Target = the sheet getting the data
Source = the sheet the data is looked up from
Source: make sure the index column comes BEFORE the column with the desired data.
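The target/source idea can be sketched in Python with a dictionary standing in for the source sheet's index column. All names and sample rows here are invented for illustration; keeping the first match for a duplicated key mirrors what an exact-match VLOOKUP does.

```python
def build_index(source_rows, key, value):
    """Map each index value in the source to the data we want to pull across."""
    index = {}
    for row in source_rows:
        index.setdefault(row[key], row[value])  # keep the first match only
    return index


def add_lookup_column(target_rows, index, key, new_column, missing=None):
    """Return copies of the target rows with the looked-up value added."""
    return [{**row, new_column: index.get(row[key], missing)} for row in target_rows]


# Invented sample data: "db" is the source, "usage" is the target.
db = [{"isbn": "9780306406157", "lc": "QA"}, {"isbn": "9781861972712", "lc": "BF"}]
usage = [{"isbn": "9781861972712", "uses": 12}, {"isbn": "9780000000002", "uses": 3}]
merged = add_lookup_column(usage, build_index(db, "isbn", "lc"), "isbn", "lc")
```

Rows with no match get None here, where the spreadsheet would show #N/A.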
Vlookup Example: Copy LC from DB to TitleUsage
Useful "Indexes" between EBSCO reports
Books: Considering these 6 types of reports:
● BR = COUNTER BR reports
● Sub = Ebook Subscription Report
● TitleStandard = TitleUsage standard report
● FTF = Full Text Finder Linking Reports
● ECM = EBSCOhost Collection Manager list export
● DB = Coverage list of Ebook subscription package
Useful "Indexes" between reports: Books
● ISBN - normalized!
○ DB and ECM have both pISBN, eISBN
○ TitleStandard = FTF = pISBN
○ BR = Sub = eISBN in ECM, DB
● Proprietary/book ID - normalized!
○ BR ID = TitleStandard ID, but totally different from:
○ Sub ID = ECM ID = DB (FTF has no ID column)
● Title? Usually unreliable, but:
○ Sub = TitleStandard = BR = FTF = DB
○ ECM titles don't match any other reports
Useful "Indexes" between reports: Journals
Considering these 5 types of reports:
● JR = COUNTER Journal Reports
● TitleStandard = TitleUsage standard report
● FTF = Full Text Finder reports
● ENET = EBSCONET price report
● DB = EBSCO database coverage lists
Useful "Indexes" between reports: Journals Almost always ISSN, BUT: ● Sometimes 2 ISSN columns, sometimes 1 ● If one, sometimes mixed pISSN, eISSN ● JR, TitleStandard, FTF all have pISSN and eISSN as separate columns ● ENET has 1 ISSN which could be p or e (sub) ● DB = pISSN
Vlookup with Wildcards and Two ISSNs - 1
Wildcard: use when the target value to match is a substring of the source value
Step 1: Combine the two ISSNs into a single column:
=E2&"; "&F2
Assume this is in column B in sheet JR1, which has usage in column D
Vlookup with wildcards and two ISSNs - 2
Step 2: In the target sheet, if the ISSN is in column D, put this in an empty column:
=vlookup("*"&D2&"*",JR1!B:D,3,0)
["3" means the third column of the range B:D, i.e. column D; "0" means exact match]
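The wildcard match amounts to a "contains" test against the combined ISSN column. A minimal Python sketch follows; the function name, the sample rows and the usage counts are invented for illustration.

```python
def wildcard_lookup(issn, source_rows):
    """Return usage from the first row whose combined ISSN text contains issn."""
    needle = issn.strip().upper()
    if not needle:
        # Guard: a blank ISSN would otherwise "match" the first row,
        # much as "**" does in the spreadsheet formula.
        return None
    for combined, usage in source_rows:
        if needle in combined.upper():
            return usage
    return None


# Invented sample of the JR1 sheet: (combined "pISSN; eISSN", usage).
jr1 = [("0001-0782; 1557-7317", 120), ("0028-0836; 1476-4687", 340)]
```

wildcard_lookup("1476-4687", jr1) finds the second row through its eISSN, even though the pISSN is listed first.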
Vlookup from EBSCONET $ to TitleStandard Usage TitleStandard sheet:
ENET sheet:
Vlookup with wildcards and two ISSNs - 3
If you have two ISSNs in BOTH documents and are not sure which might match which, combine nested IFERROR with the method above.
Do step 1 the same, but:
Step 2: in the target document, if the two ISSN columns are D and E, do:
=iferror(vlookup("*"&D2&"*",JR1!B:D,3,0), iferror(vlookup("*"&E2&"*",JR1!B:D,3,0), "no data"))
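The nested fallback reads as "try the first ISSN, then the second, else give up". A Python sketch of that chain, using invented sample data and a hypothetical function name:

```python
def first_match(issns, source_rows, default="no data"):
    """Try each candidate ISSN in order; return usage for the first one found."""
    for issn in issns:
        needle = (issn or "").strip().upper()
        if not needle:
            continue  # skip blank ISSN cells instead of matching everything
        for combined, usage in source_rows:
            if needle in combined.upper():
                return usage
    return default


# Invented sample of the source sheet: (combined "pISSN; eISSN", usage).
source = [("0001-0782; 1557-7317", 120), ("0028-0836; 1476-4687", 340)]
```

first_match(["9999-9999", "1557-7317"], source) falls through the first ISSN and returns 120 from the second.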
Vlookup Tips: ● Remember to convert column to plain text ● Plan ahead for pivot tables - put all columns you will want to pivot/report near each other
Ready, Set, Pivot! So now you have one sheet with all of the data you want to analyze. Ready for pivot tables!
(Did you forget to convert your vlookup columns from formulas to hard values?)
Pivot Tables ● Summarize lots of raw data for you - most often sum and count on categories: ○ publication year ○ LC class ○ EBSCO profile ○ publisher ● A critical tool for analysis of and creating charts of large data sets Excel looks different from Google, but functions the same
Pivot Tables ● Select all of the columns you want involved in the pivot first using the column letters (ok if extras between) ● Data - Pivot Table - accept defaults - range already selected, creates new worksheet ● Rows - add field first, then Columns if any, then Values (may be same as Rows) ● Decide if you want Values to be Summed or Counted use COUNTA for text values (COUNT ok for Excel) ● Filter - to remove unwanted categories, e.g. "(Blanks)"
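For readers who want to see what the pivot is doing underneath, here is a minimal Python sketch that sums and counts one value column per category. The field names and rows are invented; a real report would have many more columns.

```python
from collections import defaultdict


def pivot(rows, row_field, value_field):
    """Sum and count value_field for each distinct value of row_field."""
    totals = defaultdict(lambda: {"sum": 0, "count": 0})
    for row in rows:
        category = row.get(row_field) or "(Blanks)"  # empty cells get their own group
        totals[category]["sum"] += row[value_field]
        totals[category]["count"] += 1
    return {name: totals[name] for name in sorted(totals)}


# Invented sample: ebook uses tagged with an LC class.
usage_rows = [
    {"lc": "BF", "uses": 12},
    {"lc": "QA", "uses": 4},
    {"lc": "BF", "uses": 3},
    {"lc": "", "uses": 1},
]
summary = pivot(usage_rows, "lc", "uses")
```

Dropping the "(Blanks)" key from summary corresponds to the Filter step.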
Pivot Example Ebook use by LC class 28,000 lines of this...
Pivots to...
85 line summary table
Pivot Tables ● Will update in real time if data changes ● Sort/Filter on data sheet does NOT affect pivot tables ● Copy-Paste Special entire pivot table to another worksheet to clean up column headers, create groupings for subtotals, etc. (Excel easier to group related values in pivot table)
Pivot, manual grouping, chart
=sum(B1:B18)
Pivot Tables for comparing column data - Publisher DB Title List - use "Publisher" or "Contract Publisher"?
Questions?
Slide deck available:
Melissa Belvadi