The day that we, at La MaMa’s Archives, launched our digital collections site, I had one sharp regret: I hadn’t finished installing an analytics feature into the site before we went live. I was working on it, and I knew it was something I should do – but in the mad rush of all the other things I had to manage to get the site launched, it had fallen to the bottom of the priority list. I just didn’t have time before launching to do the work to make it happen, in part because there’s no simple switch to turn on site-tracking within the Pawtucket configuration profile.
But as soon as the New York Times published a short piece about the launch of our new portal, I knew I had miscalculated. Hundreds of people (thousands, maybe!) are visiting our site right now, I thought. But I can’t prove it, and I can’t track how they’re browsing and searching once they’re there.
I couldn’t go back in time, but I could install some good analytics software so that we’d be able to track views and visits to our digital collections site going forward. My experience building WordPress sites (both .com and .org) meant I was used to being able to track site views, visits, referrals, search terms, and other data about website usage. But how to track that data within the context of a CollectiveAccess platform? An old thread in the CollectiveAccess user forum suggested that Google Analytics might come prepackaged with the Pawtucket package, but I couldn’t find much support documentation for enabling this function. (And perhaps more importantly, digital librarians and digital privacy researchers are increasingly warning archives and libraries against using Google Analytics.) I happened to mention my interest in installing an analytics data tracker to La MaMa’s new IT consultant (the amazing Palante Tech Cooperative!) and they suggested that I check out the free, open source analytics platform Piwik.
Piwik turned out to offer a great solution for us. Its creators have developed a clear step-by-step installation procedure that doesn’t require tons of technical skill, and once installed, it offers a variety of stats to help you understand who’s using your site and how. (However, there are other reasons that it might not be a great solution for every institution, as librarian Scott Young recently noted on Twitter.)
You do have to have some basic skills in using the command line or an FTP client to complete the installation process. You also have to be willing to poke around a little bit in the CollectiveAccess directories on your server. But the bar is not ridiculously high, and in the end, installing Piwik offered me an opportunity to learn a great deal about my CollectiveAccess instance and the internal organization of the Pawtucket directories and files. So the whole experience was a pretty good one for me.
Because Piwik offers a clear step-by-step install guide, there’s no need for me to rehash every detail here. But there were a few places in the process where I got stuck. Below is a brief description of some of those sticking places, with some comments about how I resolved these issues. I should note that these comments are based on my experience installing Piwik on La MaMa’s front-end (catalog.lamama.org), which is built on Pawtucket version 2.0. Additionally, my comments assume that you:
- already have a Pawtucket site configured and ready to go;
- know how to use an FTP or SFTP client such as Cyberduck;
- have access to some Linux-knowledgeable tech support; and
- can get access to your CollectiveAccess webserver.
If you can’t access your CollectiveAccess files because your site is hosted by an external hosting service, you may be able to use Cloud-Hosted Piwik (you can find a guide to Cloud-Hosted Piwik here). And if you don’t know what an FTP client is, or don’t know how to use one, this resource from Wired.com offers a pretty comprehensive introduction.
Notes on Installing Piwik for CollectiveAccess Users
A 5-step guide to installing Piwik is available at http://piwik.org/docs/installation/. Steps 1 (Getting Started) and 2 (Start the Installation) were pretty self-explanatory and straightforward:
Step 1 is mostly informational. It informs users of minimum webserver and database requirements for installing Piwik, and notes that, to install it on your site, you need to have an S/FTP client and access to your server.
Step 2 guides users through the process of downloading and unzipping the Piwik files. Then it instructs you to use your SFTP client to upload these files in binary mode “to the desired location on your webserver.” It worked fine for me to upload these files to the html folder inside my CollectiveAccess file package (/var/www/html).
Step 3 (The 5-Minute Piwik Installation) gets into the nitty gritty of the install. I hit a few snags here; for me, this step took a lot more than five minutes. The first issue occurred in the first part of this step, which asks users to “open your web browser and navigate to the URL to which you uploaded Piwik.” “If everything is uploaded correctly, you should see the Piwik Installation Welcome Screen. If there are any problems, Piwik will identify them and help you out with a solution.”
In my case, there were some problems. First, it seemed that I needed to make some changes to my PHP configuration. The error message I got looked something like the one displayed in the screenshot below, which I grabbed from the Piwik users’ forum:
Helpfully, Piwik offers instructions for how to correct the error. In this case, it instructed me to “set the following in your php.ini file: always_populate_raw_post_data=-1” and, “after making this change,” to “restart” my web server.
I found my php.ini file easily enough – by going to the root folder of my server, and then to the folder called “etc” (/etc/php.ini). (For additional info about how to locate your php.ini file, check out the relevant section in the CollectiveAccess installation instructions page here.) But I couldn’t restart my webserver on my own, so I called in my IT support team to do that (thanks Jamila!).
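If you would rather script the php.ini change than edit the file by hand, something like the following Python sketch could work. It is only an illustration, not part of Piwik or CollectiveAccess: the helper name set_ini_directive is my own, and it assumes the directive either already appears on its own line (possibly commented out with a leading semicolon) or is missing entirely.

```python
import re


def set_ini_directive(ini_text: str, name: str, value: str) -> str:
    """Return ini_text with `name = value` set.

    Hypothetical helper: replaces the first existing line for the
    directive (even if commented out with ';'), or appends a new
    line when the directive is not present at all.
    """
    # Match a whole line such as ";always_populate_raw_post_data = -1".
    pattern = re.compile(
        r"^[ \t]*;?[ \t]*" + re.escape(name) + r"[ \t]*=.*$",
        re.MULTILINE,
    )
    new_line = f"{name} = {value}"
    if pattern.search(ini_text):
        # Use a function as the replacement so the value is inserted literally.
        return pattern.sub(lambda _match: new_line, ini_text, count=1)
    # Directive not found: append it on a new line at the end.
    separator = "" if (not ini_text or ini_text.endswith("\n")) else "\n"
    return ini_text + separator + new_line + "\n"


# Example on an in-memory string rather than the real /etc/php.ini.
sample = "[PHP]\n;always_populate_raw_post_data = -1\n"
updated = set_ini_directive(sample, "always_populate_raw_post_data", "-1")
print(updated)
```

To apply this to a real server you would read /etc/php.ini, pass its contents through the function, and write the result back, which normally needs administrator rights. The web server still has to be restarted afterwards for the change to take effect.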
I also had to call IT support in to fix the second error Piwik identified, which appeared under the category of “Tracker status.” The error message was “GET request to piwik.php failed.” Piwik recommended that I “try whitelisting this URL from HTTP Authentication and disable mod_security (you may have to ask your webhost). For more information about the error, check your web server error log file.” Completing this task was beyond my skill-set. But my IT support had no trouble fixing this issue.
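Before calling in help, it can be useful to see what the server actually returns for piwik.php, since that is the request Piwik says is failing. The Python sketch below is one rough way to check. The function names and the example.org address are placeholders of mine, and you would need to substitute the URL where you uploaded Piwik.

```python
import urllib.error
import urllib.request


def tracker_url(base_url: str) -> str:
    """Build the URL of piwik.php under the Piwik install location."""
    return base_url.rstrip("/") + "/piwik.php"


def tracker_status(base_url: str, timeout: float = 10.0):
    """Send a GET request to piwik.php and return the HTTP status code.

    Returns None if the server could not be reached at all.
    """
    url = tracker_url(base_url)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (e.g. 401 or 403).
        return err.code
    except urllib.error.URLError:
        # No HTTP answer at all (DNS failure, refused connection, ...).
        return None


# Placeholder address; replace with your own Piwik location before running:
# print(tracker_status("https://example.org/piwik"))
```

A 401 response means the request ran into HTTP authentication, and a 403 is often what a security module returns when it blocks a request, so the status code gives your IT support a starting point alongside the web server error log.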
I moved on, but encountered another snag in the next part of Step 3, the MySQL Database Setup. Here, Piwik needs you to enter info about the database server, log-in name, password, and database name. But I didn’t know where to find this information – so I had to ask the developers at Whirl-i-Gig directly where this information lives. Turns out all this information can be found in a file (setup.php) within the admin directory (/var/www/html/admin/setup.php). If you navigate to that file (again, using either the command line or your SFTP client), you’ll find it contains all the info you need to complete this part of Step 3 – the name of your database server host, the database log-in username, the database log-in password, and the name of your CollectiveAccess database. Copy and paste this info into the Piwik installation interface (copy, don’t cut, so that setup.php stays unchanged), and you’ll be good to go.
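If you are comfortable with a little scripting, the four values can also be pulled out of setup.php automatically. The Python sketch below assumes the file sets them with PHP define() calls on constants named like __CA_DB_HOST__, __CA_DB_USER__, __CA_DB_PASSWORD__ and __CA_DB_DATABASE__, so check your own copy, since names may differ between versions. The function name and the sample values are made up for illustration.

```python
import re

# Matches lines such as: define("__CA_DB_HOST__", 'localhost');
# Assumption: values are simple quoted strings without embedded quotes.
DEFINE_RE = re.compile(
    r"""define\(\s*["'](__CA_DB_[A-Z_]+__)["']\s*,\s*["']([^"']*)["']\s*\)"""
)


def read_db_settings(php_source: str) -> dict:
    """Return a dict of the __CA_DB_*__ constants found in php_source."""
    return {name: value for name, value in DEFINE_RE.findall(php_source)}


# Made-up sample resembling the relevant part of a setup.php file.
sample = """
if (!defined("__CA_DB_HOST__")) {
    define("__CA_DB_HOST__", 'localhost');
}
define("__CA_DB_USER__", 'ca_user');
define("__CA_DB_PASSWORD__", 'example-password');
define("__CA_DB_DATABASE__", 'ca_database');
"""

settings = read_db_settings(sample)
for name, value in settings.items():
    print(f"{name}: {value}")
```

On a real server you would read the text of /var/www/html/admin/setup.php and pass it to read_db_settings. Because the output includes your database password, avoid saving or sharing it anywhere public.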
Step 4 (Configuring Piwik) and Step 5 (Medium and High-Traffic Websites) offer self-explanatory instructions for additional optional configurations, and I didn’t hit any snags there.
So that was it; I was done!
Once installed, Piwik will give you clear instructions for where to find your new analytics dashboard. That simple but extensive dashboard (which looks something like the image below) will then let you track visits, views, and other stats related to user engagement with your site. And so far, I’m really happy with it.