Big Data Will Revolutionize Higher Education
William Morse Jr. | Former Vice President and Chief Information Officer, Pomona College
The following interview is with William Morse Jr., chief technology officer and associate vice president for technology services at the University of Puget Sound. Morse Jr. has shared some interesting ideas regarding how “Big Data” will improve higher education management in the future and, in this interview, he discusses his vision for the future role of data in higher education management and shares his thoughts on the biggest challenges facing the implementation of tools geared toward this end.
1. You recently spoke about the value “big data” can bring to higher education management. What is “big data”?
Quite simply, it’s lots of facts, it’s lots of information, just about anything that you can imagine that occurs at an institution. And some of that data and facts can be seemingly unrelated. But you want to collect [those] facts so that you can begin … putting it into a database as reportable. It allows you to do a predictive analysis and it’s much greater than the traditional transactional data that you get at most institutions; that’s the nuts and bolts of how an institution works, running from financial aid and admissions to the grades a student earns, for example. It’s much more than that. And so, the question is, “Where is all this data? Where does it come from?”
And I think that most institutions, if you really look at it, they already have it. They just haven’t collected it in a way that is reportable, that gives a holistic view of their institution. So the data, for example, is in their admission process. We collect a lot of information about a student that we don’t actually use later down the road, but that helps us make sure that student is a good fit for the institution. We do it in our co-curricular transcripts that many institutions are making. That’s the, “What does the student do outside the academic program? What are the clubs they’re in, what are the organizations, what’s their position within those organizations?” A lot of universities are now producing those transcripts in addition to the traditional transcript.
We have information in our learning management systems, our LMS. That’s, “What does the student do online? How [much time] do they spend? What are they reading? What are they doing in the class?” We have that information; that information may go today to evaluate the student’s performance in the class. But that’s really where it stops.
We also have some additional data mining opportunities … in the traditional transactional system. So, we look at financial aid as a process, as a transaction we do. We calculate how much a student might need or what not. But that information is incredibly useful if you analyze it for a big data situation because then it helps you predict how that student is going to be attached to the institution and other such. So, that’s a simple definition of what big data is.
2. How would better access to, and organization of, data change the way you manage your institution?
Well, many institutions today are already using big data in a smaller format. So, for example, admission. Admission as a field has them using big data to predict which students are actually going to come to the institution so that you can concentrate your limited resources on students that are more likely to come. They use it to also get the students that fit within the institution’s needs. A student that’s going to be most successful and yet have financial aid that is going to work for that institution.
But what we’re talking about here is greater. It’s not going to just stop at the admission door, where we get the students in the door. The idea here is we’re going to target our work, our efforts, towards recruiting students who are going to actually come to the university and then be successful and stay through graduation. And if you look at a lot of the requests that are being made on higher ed, we’re going to be facing a time of limited resources and those resources are going to be targeted towards universities that are very good at getting students, getting them to stay and graduating them. I think that we’re going to be asked to produce a product that results in success of students and I think we’ve heard that just recently in the State of the Union Address from Obama. So, that’s the real power of this system. You can get this data and you can do correlative analysis to see how this student traditionally — or one like them — they tend to come to the university and be successful.
But it has other uses as well. For example, retention. If you look at most institutions today, looking at a student that might be getting in trouble, for example, one that might be having a tough time. Retention is done through a haphazard methodology. A faculty member may say, “Oh, this student’s having trouble. And, I’m going to report that.” And then they begin work. But a big data system may look at all the characteristics of that student and do some predictive analysis and say, “Okay, this student is not participating in co-curricular activities or grades seem to be going down.” It’s going to flag that student and so we can be proactive before it gets to the point of where the student is really in trouble, because it’s a point where the student is really in trouble that the faculty member is going to reach out and make that corrective step.
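The proactive flagging described above can be sketched in a few lines. This is a minimal rule-based illustration, not a predictive model and not the system used at any institution: the `StudentSnapshot` fields, the `retention_flags` function, and the 0.5-point GPA drop threshold are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class StudentSnapshot:
    """Hypothetical per-term summary pulled from several campus systems."""
    student_id: str
    gpa_previous: float
    gpa_current: float
    cocurricular_activities: int


def retention_flags(s: StudentSnapshot, gpa_drop_threshold: float = 0.5) -> list[str]:
    """Return human-readable reasons a student may need early outreach.

    The threshold is an assumed policy value, not a researched cutoff.
    """
    flags = []
    if s.cocurricular_activities == 0:
        flags.append("no co-curricular participation")
    if s.gpa_previous - s.gpa_current >= gpa_drop_threshold:
        flags.append("grades declining")
    return flags


students = [
    StudentSnapshot("A", gpa_previous=3.4, gpa_current=3.5, cocurricular_activities=2),
    StudentSnapshot("B", gpa_previous=3.2, gpa_current=2.5, cocurricular_activities=0),
]

# Keep only students with at least one flag, so advisors see a short list.
flagged = {s.student_id: retention_flags(s) for s in students if retention_flags(s)}
print(flagged)
```

A real system would replace these hand-written rules with a model fitted to historical outcomes, but the shape is the same: combine signals from several sources and surface the student before a faculty member has to notice the problem.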
And so, it can also be used in, for example, looking at a student’s choice for majors or for a particular class. So, a faculty advisor can say, “Based on your past, or based on some characteristics, maybe tests you’ve taken or other things, this class may not, or this major may not, be for you.” So it helps us keep those students moving in a direction where they will be successful, because a lot of students come to universities and maybe their parents have said, “We want you to become a doctor,” or something along that line. … We don’t want to squash their dreams, but we can give them advice on how to be more successful in going to the university.
In another area I think is really exciting is that many institutions are trying to make themselves more successful with first-generation college students. We’re going to have a huge population of potential students who are going to be first-gen and they have particular needs. So, colleges are looking for ways to make the institution accessible to them so that they will be successful. And so, by looking at students that come through that are first-gen and seeing how they were successful, you can do, again, predictive analysis and maybe design a program that is going to work for them. …
You can track trends and majors; certain majors over time become more popular than others. This will help you make that analysis so that you have the right number of faculty members and you can offer the classes that students want, because if your students aren’t happy with the class mix that you have, well, they’re going to go to another place.
So, I think that this is just a huge opportunity for institutions and I couldn’t be more excited about it.
3. What tools would be required for an institution to go about mining for this data?
Well, that’s a good question. And it’s one that people often ask: “What am I going to need to do this?” Because when you’re getting into databases, those things can be challenging. So you need to go about this in a way that’s organized. So, you’re going to need to have a central database system where this data is going to be collected. And you’re going to want to have that be the system of record for reporting. So, you’re going to need to build links from your existing data system into this system and normalize that data. What I mean about that is to have the data link correctly. You’re going to also want to make sure that the data is linking in a way that doesn’t change the meaning of the data. So, you’re going to want to have a data standards committee that looks through and says, “This term means this.” Because the worst thing you can do is have your data linked in a way that changes the meaning. And that really damages your database and makes it give you analysis that isn’t true.
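One way to picture the work of a data standards committee is as an explicit mapping from each source system's vocabulary to a single agreed term. The sketch below is illustrative only: the system names, the status codes, and the `normalize_status` helper are hypothetical and do not come from any particular product.

```python
# Agreed vocabulary: (source system, raw code) -> canonical term.
# Every entry here is an invented example of a committee decision.
CANONICAL_STATUS = {
    ("admissions", "ENR"): "enrolled",
    ("admissions", "DEP"): "deposited",
    ("registrar", "Active"): "enrolled",
    ("registrar", "LOA"): "on_leave",
}


def normalize_status(source_system: str, raw_value: str) -> str:
    """Translate a source-system code into the agreed term.

    Unknown codes raise an error instead of being guessed, because a
    silent mismatch is what changes the meaning of the data downstream.
    """
    key = (source_system, raw_value.strip())
    if key not in CANONICAL_STATUS:
        raise ValueError(f"No agreed meaning for {raw_value!r} from {source_system}")
    return CANONICAL_STATUS[key]


print(normalize_status("registrar", " Active "))
```

Failing loudly on an unmapped code is a deliberate choice here: it sends the question back to the committee rather than letting an unreviewed value into the system of record.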
And then, of course, you’re going to have to have the reporting tools that allow people to go in there and easily ask what-if questions. The neat thing about big data is that you have lots of information about a situation, a person, and you can go in there and you can do what-if scenarios. Things that seem to be unrelated, you can say, “Well, if a person is taking this class and they have this amount of extracurricular activity, or they’re doing this work for extracurricular activity, does that somehow affect their success? Or is there something that is going to result in the understanding better, how that student is going to perform or what I might be able to do as an institution, what services I might be able to supply to that student to help them perform better?”
So, you’re going to want those tools to be easy to use, because you’re going to want them widely distributed so that people can actually do that on a wide basis. And that’s the power. You really want to empower people to go out there and do this work.
And, finally, you’ve got to have … great people within your IT unit that understand how to build systems like this. Certainly, building systems of this nature is complex and you have to have the right mix of people to make sure that it’s done correctly. And you’ve got to have the right group of people that can listen to your constituents so that you’re making it useful. If you’re building a beautiful and elegant data system that isn’t useful to the community, then you have not done anything at all; it’s not actually achieved any goals. So, I think of those as the types of things to keep in mind as you’re beginning to think about that process.
4. How would the data be stored in such a way that it could be accessible and asked the what-if questions that make it functional?
Well, you’re going to need a relational database and that’s pretty much the key core right there. …
I mean, many universities have multiple databases that they use but they work in a transactional system, so you need to understand where the authoritative data stands within those systems. So, you’re going to want to know that this data here is the true data in the institution and clean that up. And, particularly, you’re going to want to also look at the shadow systems (universities are notorious for having shadow systems) and minimize those so you really have the right data. And then you’re going to flow it into the central database … with your data standards and understanding what the terminology means and having that be consistent across the institution in a way that links that data correctly. So, there you’ve got that central database that’s going to be relational that allows you to do this analysis, and tools are going to be available for people to access it.
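To make the idea of a central relational database concrete, here is a small sketch using Python's built-in `sqlite3` module. The table names, columns, and sample rows are invented for illustration; a real warehouse would be fed from the admissions system, the LMS, and so on. The query asks one simple what-if question: do students with any co-curricular activity pass courses at a different rate?

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Hypothetical schema: one table per source, linked by student_id.
conn.executescript("""
CREATE TABLE student (student_id TEXT PRIMARY KEY);
CREATE TABLE cocurricular (student_id TEXT REFERENCES student, club TEXT);
CREATE TABLE course_result (student_id TEXT REFERENCES student,
                            course TEXT, passed INTEGER);
""")
conn.executemany("INSERT INTO student VALUES (?)", [("A",), ("B",), ("C",)])
conn.executemany("INSERT INTO cocurricular VALUES (?, ?)",
                 [("A", "debate"), ("B", "choir")])
conn.executemany("INSERT INTO course_result VALUES (?, ?, ?)",
                 [("A", "BIO101", 1), ("B", "BIO101", 1),
                  ("C", "BIO101", 0), ("C", "CHEM101", 1)])

# Tag each course result with club membership, then compare pass rates.
rows = conn.execute("""
WITH tagged AS (
    SELECT r.passed,
           CASE WHEN r.student_id IN (SELECT student_id FROM cocurricular)
                THEN 1 ELSE 0 END AS in_club
    FROM course_result r
)
SELECT in_club, AVG(passed) AS pass_rate
FROM tagged
GROUP BY in_club
ORDER BY in_club
""").fetchall()

for in_club, pass_rate in rows:
    print(in_club, pass_rate)
```

With only three made-up students the numbers mean nothing, and a difference in pass rates would show correlation rather than cause. The point is that once the sources share a key, a question that spans them becomes a single query.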
So, you’re going to want to make sure though, as you do this, that you’ve got your [Family Educational Rights and Privacy Act] rules in place and understand what the data can and cannot be used for and understand what the meaning of the data really is. You wouldn’t want to have this data, for example, widely publicized on the web, but you definitely want to make it accessible enough so you can do this type of predictive analysis, and give people the access to do general research without getting into specifics that they’re not supposed to have access to.
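One common way to let people do general research without exposing individuals is to report only aggregates and to suppress groups that are too small. The sketch below assumes a minimum group size of five; that number is a placeholder for whatever an institution's own privacy policy sets, and the function and data are hypothetical.

```python
from collections import defaultdict

MIN_GROUP_SIZE = 5  # assumed policy value; set by the institution, not by this code


def retention_by_group(records, min_group_size=MIN_GROUP_SIZE):
    """Return the retention rate per group, or None where the group is too small.

    `records` is an iterable of (group_name, was_retained) pairs with no
    student identifiers, so the report cannot be traced to one person.
    """
    totals = defaultdict(lambda: [0, 0])  # group -> [count, retained]
    for group, retained in records:
        totals[group][0] += 1
        totals[group][1] += int(retained)

    report = {}
    for group, (count, kept) in totals.items():
        report[group] = None if count < min_group_size else kept / count
    return report


records = ([("biology", True)] * 4 + [("biology", False)]
           + [("classics", True)] * 2)
report = retention_by_group(records)
print(report)
```

Here the two-student group is reported as `None` instead of a rate, since a rate for so few students could effectively identify them.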
So, I think that that’s the kind of keys that I would be thinking about of how I would do this and how the data would be stored. And it’s also something I would think you need to keep in mind that it’s organic so that, as you design this system, you can add different and new data as things come along as well. So you don’t want to have this locked in stone so that it’s inflexible.
5. Looking into the future, what are some other tools that would make managing your institution easier?
I’m thinking about this project as we’re doing this right now, because we have this project underway. We have, at the University of Puget Sound, … completely redesigned our transactional data system, our infrastructure, our … Enterprise Resource Planning system, from the ground up. And we did that with the goal of being able to do this type of analysis. I think that we, here, are very aware of the changing times and we want to be as efficient as we possibly can. And part of that is being able to look at our data from a holistic point of view, which this big data system is going to do.
But, in trying to put this system together, I would say that the most complex thing that we’re having to deal with is getting the data out of these other systems and into the new system. You have to build those links manually right now. You have to go into the various data systems and import that data into it. Let’s say you’re pulling in data from your LMS, you’re pulling it in from the admissions system or co-curricular transcripts or whatever else; those are all manual and you’ve got to place it in the database in the right place. You’ve got to make a thoughtful decision. And, right now, there’s not an out-of-the-box solution that will help you collect that data in a way that makes it easier. So you have to have somebody on staff who understands how to link that data in a way that still continues to make sense. So, I’m looking forward to a day (and I think all the major ERP [enterprise resource planning] vendors are making great progress on this) where these types of things can be more delivered so it’s more plug-and-play and that would make it more accessible to other institutions.
6. What are the biggest roadblocks standing in the way of the implementation of these tools?
I think the biggest roadblocks to this are, well, we kind of talked about this a little bit, is that it does take a lot of effort. And it does take a thoughtful process. You have to almost data-architect your institution and that’s hard with legacy systems.
But I would say the biggest thing that would be holding people back is fear and even the fact that this has now hit the scene in a big way and people are looking at it as, “Ah, this is just another IT fad.” If you’ve been around in IT as long as I have, I can’t tell you how many times people said, “Well, this technology is going to change everything and it’s going to make it a whole new day!”
I think the difference this time is that this tool really is something that is more settled. It’s something we know how to do. And, if you think about it, the base-core of what we’re talking about was a fad, 10 to 15 years ago; it’s a data warehouse and, back in those days, they were simply looking at normalizing their transactional data, getting that in one place. Because there were many more databases in the past that institutions had to deal with.
Over the last 10 to 15 years, we’ve moved to a more centralized ERP structure in most cases. They wanted an authoritative database, they wanted to have some place where this data was true because they’d have, for example, addresses in the admissions system, addresses in the alumni system; which one is the right one? They wanted to have that authoritative database. And they wanted to make it relational. Ten to 15 years ago, these databases weren’t relational; many of them were flat and, so, they wanted to have a system that would make these careers in reporting. It seemed like a wonderful idea back then, but in many cases, we’ve thought it was an absolute disaster. They were a disaster 10 to 15 years ago because the tools weren’t right, they didn’t have good process in doing this, the links to … external systems were really hard to do, they were all batched. So, you could imagine a batched process if it wasn’t programmed correctly, could overwrite good data from some other system. And the reporting tools just weren’t up to date.
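The risk of a batch job overwriting good data can be reduced by having the import check each incoming record before applying it. This is a minimal sketch under assumed rules (skip blank values, and skip anything older than what the central store already holds); the `merge_batch` function and the sample addresses are hypothetical.

```python
from datetime import date


def merge_batch(central, batch):
    """Apply a batch of address updates without clobbering newer data.

    Both arguments map student_id -> (address, last_updated).
    Returns the ids that were skipped so someone can review them.
    """
    skipped = []
    for student_id, (address, updated) in batch.items():
        current = central.get(student_id)
        is_blank = not address
        is_stale = current is not None and current[1] >= updated
        if is_blank or is_stale:
            skipped.append(student_id)
            continue
        central[student_id] = (address, updated)
    return skipped


central = {"A": ("12 Oak St", date(2013, 2, 1))}
batch = {
    "A": ("9 Elm St", date(2012, 8, 15)),    # older than central: skipped
    "B": ("4 Pine Ave", date(2013, 1, 10)),  # new student: applied
    "C": ("", date(2013, 3, 1)),             # blank value: skipped
}
skipped = merge_batch(central, batch)
print(central, skipped)
```

Returning the skipped ids, rather than dropping them silently, leaves a trail for deciding which system really holds the authoritative address.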
So many of the institutions have said, “We’re not going to do this project anymore.” Or it didn’t achieve the goals they were looking at. So if you go to EDUCAUSE and say “data warehouse,” you get this look from people’s faces, like, “Oh my goodness, run from the room!”
But the neat thing about this is that, while we in higher ed struggled with data warehouses, the folks in corporate, they figured it out, they kept going and looking at this and working towards resolving the challenges they dealt with working with that type of tool. And they figured it out. They also said, “Oh my goodness, we can collect a heck of a lot more data than just the transactional data, we can add all this other stuff and get a much better, much more holistic view of the people we’re trying to work with, so we can micro-target them, we can gather information on them and better sell our product.” Well, those lessons that they’ve learned, I think, are now what this really is. It’s coming back to us as big data, so it’s not a fad, it’s not something to be afraid of. They know, in the IT community, how to do this. The trick is now putting those tools together in a way that actually results in this data. And, I think, I named from the things … we’re already collecting this data, you’re already doing it, you’re doing it in your LMS. If you have a CRM [customer relationship management], for example, you’re doing it in tracking those emails that those students are getting. You have that data; the trick is just finally just taking that next step and getting it into this database and, again, I think we have the tools, we have the technology — and the reporting tools are a heck of a lot easier to use now to actually make this successful.
And, if we can, I think the results of offering the student, recruiting them here and saying, “You know what, we’re recruiting you and we know, we feel, that based on your characteristics and path of some of the other students that you’re going to be successful here.” And what a wonderful promise that is because, right now, it’s still a mystery. A parent might be saying, “Is my student going to be happy here or successful here?” … It’s worth the risk and the work that goes into doing this because the rewards are going to be amazing.
7. Is there anything you’d like to add about the growth of big data and how technology is going to change higher education management in 50 years’ time?
Well, if you work in IT, it’s all about change. And it’s all about new things and adding what I’d like to call, “new tools in the tool box.” And, as fads have come and gone, those have all added new tools. I just think when you work in higher ed, you’ve got to be excited about what you do. You’ve got to look at it with this energy and I think that’s even more true today, because IT in higher ed is going to be essential to our being able to address the concerns that are being asked of us.
We’re going to have to be looking at how we can be efficient, how we can make those dollars stretch so that we can continue educating the next generation in the way we’ve been successful at for 100 to 200 years. I think times are changing, and we’re just going to have to recognize that IT is going to be essential to that. And we need to, as a service organization, be out there, proactive in our partnership with the administration, so that we can continue helping our schools to be successful.
Author Perspective: Administrator
Collecting and analyzing big data is helpful to universities in the long run. It’s more cost effective when an institution invests resources to attract and retain students who are a good fit for the school. It creates a better experience for the student and a smoother process for the institution.
Morse does a good job of identifying uses of big data beyond admissions and marketing. Like Morse, I believe that big data can be extremely valuable for determining how an individual learns and what type of support he/she needs to succeed at the institution. Morse suggests, and I agree, that big data should thus be applied to develop initiatives aimed at improving retention.
While this is a neat idea, it also sounds like it will be expensive to develop a system capable of retaining and analyzing the scope of data being described in this article.