Oct 13

Machine Learning MOOC meetup

We’re organizing a meetup for Coursera’s machine learning class taught by Andrew Ng (https://class.coursera.org/ml-004/class). The first meetup will be Friday, October 18 at 3:00 pm in 201A Millennium Science Complex.

The purpose of this meeting is to provide an opportunity for the local machine learning practitioners and data scientists to meet in person, organize a MOOC study gathering, and celebrate their accomplishments and newly acquired knowledge. In addition to discussing the class, later meetings may include mini-lectures / workshops from experts in the field.

If you would like to join this event, please register through this link:


May 13

1st Geo-Tweet Seminar

Zhuojie Huang is organizing a Geo-Tweet Seminar. The first one is scheduled at W201A Millennium Science Complex from 2:00 pm to 3:00 pm on 5/31/2013.

This room has camera and video facilities, so we can probably set up a Google Hangout for remote participants.

The purpose of this meeting is to enhance collaborations on the analysis of GeoTweets. The first meeting will feature a brief introduction to this seminar and our group, some reading materials, possible research opportunities, and future directions. It is also a great opportunity to get to know the experts at Penn State who are willing to work on Twitter data.

The reading materials can be downloaded here


We might not be able to cover all of them at the first meeting, but I will try to set up a Mendeley page for bibliographic organization and a WordPress page to record all the relevant tools for analysis. I would like to keep a balance between technology and science, so more social network science papers will be put on the list for the next meeting.

If you would like to join this event, please register through this link:


The password to enter is GEOTWEET.

Feb 13

Hands-On Session 6: Sonification

UPDATE: Slides and code available at https://github.com/jrimland/supercollider

Friday, March 1 at 1PM, Jeff Rimland presents: “It Might Get Loud” — An Introduction to Sonification and SuperCollider.

Data-driven audio synthesis is becoming an increasingly useful technique for exploring complex data sets and phenomena ranging from the origins of the Universe to the correlation between Twitter and the stock market. SuperCollider is a powerful open-source software tool used by scientists (as well as musicians and sound effect technicians) for a broad variety of audio generation applications. This talk will include an introduction to sonification, a hands-on session for learning the basics of SuperCollider, and a discussion period to provide guidance on applying these techniques to your dataset analysis or audio generation goals.

SuperCollider is freely available for Mac, Windows, and Linux at http://supercollider.sourceforge.net/

Note: Please bring headphones if you would like to follow along with the hands-on session!

The session will be held in room 113 (a.k.a. the Cybertorium) in the Penn State Information Sciences and Technology (IST) building from 1-2PM on Friday, March 1.

Jan 13

Upcoming Hands on Sessions

We’ve got two exciting Hands on Sessions coming up! Location and time as usual.

Friday Feb 1st, 1 pm: Hands on Session 4: Using Amazon EC2 to set up a Hadoop cluster (with Todd Bodnar)
Location: Millennium Science Complex, room W-203A

Friday Feb 8th, 1 pm: Hands on Session 5: Using R to work with data (with Lindsay Beck-Johnson)
Location: Millennium Science Complex, room W-203A

Jan 13

Hackathon January 21st

With less than two weeks to go before Jan 21, we’d like to invite you once again to participate in the Hacking Science Hackathon.

The Hackathon is an opportunity to prototype an app, get your hands dirty with data problems you might be having, or help someone else with their problem. In the beautiful commons room on the 3rd floor of the Millennium Science Complex, we’ll start at 10 am with some coffee and bagels and assemble in groups looking to work on interesting problems. At 11 am, we’ll start the hacking!

Free food will be available throughout the day. After dinner at 7 pm, groups will demo their problem and their progress, after which hacking will continue into the night.

To make matchmaking of interests and skills easier, we are putting up two Google docs – one for ideas, and one for people / skills. Feel free to put up your own info and get in touch with others before the event.

Please register early for the event; it’ll make organizing it much easier for us. Thanks!

PS New here? Don’t forget to get on the email list and follow us on Twitter @hackingscience. We run regular Hands on Sessions on all kinds of topics – for upcoming ones in January, see https://sites.psu.edu/hackingscience/2013/01/01/hands-on-sessions-in-january/

Jan 13

Hands on Sessions in January: Client-Side Programming, Mapping Tools & Git(Hub)

One of the core themes of Hacking Science is that execution is 99% (the other 1% being the idea). In that spirit, we are introducing Hands On Sessions – short overviews on a topic or theme, with lots of examples, and where participants are expected to bring their laptops / tablets.

We have the following Hands On Sessions lined up for the next three weeks:

Friday Jan 4, 1 pm: Hands on Session 1: Client technologies – HTML, Javascript / DOM, CSS, SVG etc. (with Marcel Salathé)
Location: Millennium Science Complex, room W-203A

Friday Jan 11, 2 pm: Hands on Session 2: Mapping Tools – Analysis and Visualization of Spatial Data (with Josh Stevens)
Location: Millennium Science Complex, room W-203A
(Slides & code + data available)

Friday Jan 18, 1 pm: Hands on Session 3: Version control & social coding with git and GitHub (with Alex Shovlin)
Location: Millennium Science Complex, room W-203A
(Slides available)

Dec 12

Location of Jan 21 Hackathon: Millennium Science Complex

Really excited to announce the location of the Jan 21 Hackathon: the commons area on the third floor in the Millennium Science Complex. An absolutely amazing space!

Dec 12

Hacking Science Hackathon on Jan 21, 2013

We’re happy to announce the first Hacking Science Hackathon to be held on January 21, 2013 (Martin Luther King, Jr. Day).

The Hacking Science Hackathon is going to be a one day event where ad hoc teams of programmers and scientists work together to solve interesting problems of their choice. You might be looking for an interesting programming challenge; or have an interesting data set that you’d like to work on with other people; or have been looking for an opportunity to build a tool with others that helps you deal with data mining, analysis, visualization: this hackathon is for you.

Free food / drinks will be provided, and the day will be interspersed with short presentations on various practical topics (e.g. using git / GitHub, R, viz tools such as Processing, D3.js, Gephi etc.)

We’ll send out more information soon (location, logistics, registration etc.) but for now please save the date.

Also, if you have ideas on how to make this event a success, i.e. if you want to provide a short demo during the day on any topic, or if you have data you’d like others to work with, or want to contribute in any other way: please do get in touch at hackingscience@psu.edu.

Dec 12

What is Hacking Science?

Hacking Science has, after just two events, garnered a lot of attention on campus. In this post, I would like to describe what we are trying to do with Hacking Science, and outline some of the rules that I think should guide us.

In a nutshell, we want to create an ecosystem at Penn State for people to connect with each other and learn about the opportunities and challenges in the space where coding, big data, and scientific research come together. As a consequence, the success of this mission is completely dependent on our ability to build and maintain a community that is actively engaged in Hacking Science. We are planning on having regular events (meetups, unconferences, hackathons) to further our cause and grow this community.

Most people get Science, but what about the Hacking? Hacking Science can be interpreted in two ways: Hacking the Science, or the Science of Hacking. It really doesn’t matter that much. Hacking has often been associated with malicious intentions, but this type of definition for hacking is becoming increasingly outdated. Hacking has become a generic term to describe the process of finding creative ways, often involving coding, around an existing problem or roadblock. From hacker news to mind hacks to hackathons, the word hacking embodies the spirit that problems can be (and must be) solved with software and science. It is in this spirit that we use the term.

I’d like to outline a few basic premises of Hacking Science that I think we as a community must embrace (indeed not only embrace, but champion). In the first Hacking Science meeting a few weeks ago, I presented the following list of “rules”: 1. Forget that you are in academia, 2. Don’t waste time on definitions, 3. Welcome failure (and fail fast), 4. Change your mind often, 5. Ideas = 1%, Execution = 99%, 6. Share widely. To this list I would now like to add the 0th rule: talk about Hacking Science. None of these ideas are mine, let alone new, but let me still say a few words about how they relate to Hacking Science.

0) Talk about Hacking Science
In obvious reference to Fight Club, the 0th rule of Hacking Science is that you should talk about Hacking Science. Can you imagine what an amazing community this would be if everyone knew about it? It’s probably a conservative estimate to assume that 1 out of 50 people on campus is interested in the themes of Hacking Science – that translates into 1000 people. Imagine if we got all of these people together, interacting with one another, sharing knowledge, and building great things. This can happen, but only if everyone knows about Hacking Science. So please tell your friends and colleagues about Hacking Science. A few tools:

1) Forget that you are in academia
Come as you are. Perhaps you are an undergraduate or graduate student. Perhaps you are a staff or faculty member. Perhaps you are a postdoc, or the president. Perhaps you are associated with departments, and perhaps you are majoring in some field of expertise. Great. Now, what can you bring to the table? What interesting problem, tool, solution, method, data set, skill set etc. do you have? What are you passionate about? At Hacking Science, we are all about solving problems. We are interested in where you are going, not where you are coming from. What you should not forget about academia is that you are surrounded by an amazing group of very smart and interesting people.

There has been a lot of talk about the idea that going to college is somehow outdated. Take a look at a recent article in the NY Times, “Saying No to College”. The article starts by highlighting the case of a CS student who has had a frustrating college experience:

Two years ago, Mr. Goering was a sophomore at the University of Kansas, studying computer science and philosophy and feeling frustrated in crowded lecture halls where the professors did not even know his name.

“I wanted to make Web experiences,” said Mr. Goering, now 22, and create “tools that make the lives of others better.”

While most of us have experienced firsthand the frustration that large and crowded lecture halls can bring, this should not interfere with the motivation to make Web experiences and create tools that make the lives of others better. A research University like Penn State is full of people who want to make the lives of others better, and who have the projects to do just that. What many of them probably need – often desperately so – are motivated people with coding skills who can help them with these projects. Come to think of it, this would be a good measurement of success or failure for the Hacking Science community: if a motivated CS student with coding skills leaves the University because s/he thinks there are no opportunities to build tools that make the lives of others better, we will have failed. (If you are currently one of these students – please get in touch with me. I am working on many projects that are designed to help us reduce the burden of infectious diseases, and I very much need your help – your coding skills are a great asset.)

Even if your first priority is building a start-up, heed the advice that VC / startup Überlord Paul Graham expressed in a recent essay, “How to get startup ideas”:

The clash of domains is a particularly fruitful source of ideas. If you know a lot about programming and you start learning about some other field, you’ll probably see problems that software could solve. In fact, you’re doubly likely to find good problems in another domain: (a) the inhabitants of that domain are not as likely as software people to have already solved their problems with software, and (b) since you come into the new domain totally ignorant, you don’t even know what the status quo is to take it for granted.

So if you’re a CS major and you want to start a startup, instead of taking a class on entrepreneurship you’re better off taking a class on, say, genetics. Or better still, go work for a biotech company. CS majors normally get summer jobs at computer hardware or software companies. But if you want to find startup ideas, you might do better to get a summer job in some unrelated field.

Or don’t take any extra classes, and just build things. It’s no coincidence that Microsoft and Facebook both got started in January. At Harvard that is (or was) Reading Period, when students have no classes to attend because they’re supposed to be studying for finals.

This is exactly what Hacking Science should be very good at – providing a forum for a clash of domains. Getting people exposed to new fields, getting them to work together. Build things.

2) Don’t waste time on definitions
In order to heed this advice, I’ll keep this section very short. It’s easy to get dragged into endless discussions on definitions. In the scientific world it is of course very important to specify exactly what you mean. But as a community, it should not be our first priority to spend too much time on thinking about clear definitions for terms like data science, big data, hacking, etc.

3) Welcome failure (and fail fast)
One of the most inhibiting features in the academic world is an excessive fear of failure. What’s wrong with being wrong? Not much, if you adapt quickly. In academia, people get easily attached to their ideas. This is probably true of humans in general, but it seems to be pervasive in academia because ideas are highly valued (see rule 5). If you have a great idea, you’ll be considered creative and smart – the idea might even be given your name. Many scientists will defend their idea to their deathbed (as Max Planck drily noted, science advances one funeral at a time). The reason for this is that the marketplace for ideas in the world of science has a very different time scale than the marketplace for ideas in business. If your scientific ideas are wrong, they will eventually be overturned, but this process might take decades (especially if they are hard to verify empirically). If your business ideas are wrong, you will go out of business in a matter of months or even weeks.

What’s more, science is a game of rejection. Reading the scientific literature – or worse, the scientific press – you would think that all these successful scientists out there have had brilliant ideas, executed them flawlessly, and the world came to cheer them on. If you think like that, you are falling victim to survivor bias – a distorted view of reality because you are only looking at a small, successful subset (akin to believing that dropping out of college is necessary for jobsian or gatesian success). Most papers get rejected. Most grant proposals get rejected. Indeed, there is even good evidence to suggest that most current research papers are wrong. And yet, scientists fear failure like almost nothing else. I suspect that the temporal dynamics play a huge role here. If you are working on a paper for two years, or if your grant proposal would fund your lab for another five years, you are understandably concerned if things don’t work out. The fact that your career is tied to these things doesn’t help.

In the Hacking Science community, we don’t have these issues, and we therefore should take advantage of that and embrace failure – and try to fail fast. If you build something in code, and it doesn’t work, fix it and move on. If it works but nobody wants to use it, move on. The faster you recognize this, the faster you will make progress.

It’s helpful to envision progress as moving along on a landscape, trying to find the highest peaks (where higher peaks mean more success, however you want to define success). In evolutionary theory, this concept is known as the fitness landscape on which populations of genotypes are pushed around by various forces, most famously by natural selection. If you are moving in a certain direction and you realize that you are actually moving away from a peak, you should change the direction. The faster you realize that you are going away from a peak, the better (this is not always true – sometimes you need to cross a valley to reach an even higher peak on the other side). Embracing failure simply means that you are not terrified of making the wrong move – you simply make the move, and then assess the situation objectively. If you realize you’ve made the wrong move, great – you now know more than you did before. Crucially though, you now need to act – and that often requires that you are willing to change your mind.

4) Change your mind often
Jeff Bezos has famously said that “people who were right a lot of the time were people who often changed their minds”. This makes complete sense if you think about moving along on the landscape envisioned in the previous paragraph. After you’ve made a move and realized that it was wrong, you just change your mind and move in a different direction. Simple, right?

Turns out that this is easier said than done. It’s hard to change your mind when your mind has already decided what’s right. It doesn’t always feel good either – if you change your mind often, doesn’t that mean you’re wrong often? If so, then what can you rely on, if you can’t even rely on what you think is right? What’s more, our culture doesn’t always value when people change their minds. We love people who stick to their guns. We love stories where people’s views were ridiculed at first, only to be vindicated after a long time. This is especially true in science – the real scientific hero in our culture is the one who is fighting an uphill battle his entire life, only to be vindicated in old age (being vindicated after your death makes for an even better story). But while these are great stories, they suffer from the same drawbacks as the stories of Jobs, Gates, et al. – survivor bias. We have yet to see the Hollywood movies of those guys who, in the face of increasingly contradictory evidence, never changed their minds, and in the end turned out to be… wrong.

Imagine you are at some random point in the landscape. Your goal is to get to the peak. You don’t know where the peak is, but you have the tools to measure your altitude. So here’s your strategy: measure your altitude, make a move, measure your altitude again. If it has increased, you keep moving in that direction; if it has decreased, you change direction. If you keep doing this, you will eventually get to the peak. The main insight here is that almost all moves will lead to a decrease in altitude – most of your moves will be wrong. And yet, if you keep measuring objectively, and instantly react based on these measurements, you will eventually get to the peak. You could argue that random movement is not an efficient strategy overall, and I would agree – on your way to the peak you would be overrun by people who have better intuitions for the next move pointing upwards, and by people who are smart enough to figure out patterns in the landscape. Nonetheless, even these people will make a lot of moves that are wrong. Eventually, the first person to reach the peak has perhaps a number of properties – smart, motivated, lucky, etc. – but the one property you are guaranteed to find is the ability to change their mind often.
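
For readers who like to see this in code, here is a minimal Python sketch of the measure, move, measure again strategy. Everything in it is an assumption made for illustration: the altitude function is a made-up landscape with a single peak at (3, -2), and the names climb, step_size and direction_changes are hypothetical, not taken from any library.

```python
import math
import random


def altitude(x, y):
    # Hypothetical landscape (an assumption for this sketch):
    # one smooth peak located at (3, -2).
    return -((x - 3) ** 2 + (y + 2) ** 2)


def climb(start, steps=2000, step_size=0.05, seed=0):
    """Walk uphill by trial and error, changing direction after every bad move."""
    rng = random.Random(seed)
    x, y = start
    current = altitude(x, y)
    angle = rng.uniform(0, 2 * math.pi)  # initial direction, chosen at random
    direction_changes = 0
    for _ in range(steps):
        # Make a tentative move in the current direction and measure again.
        new_x = x + step_size * math.cos(angle)
        new_y = y + step_size * math.sin(angle)
        measured = altitude(new_x, new_y)
        if measured > current:
            # Altitude increased: accept the move and keep the direction.
            x, y, current = new_x, new_y, measured
        else:
            # Altitude did not increase: stay put and change your mind.
            angle = rng.uniform(0, 2 * math.pi)
            direction_changes += 1
    return (x, y), current, direction_changes


if __name__ == "__main__":
    position, height, changes = climb((0.0, 0.0))
    print("final position:", position)
    print("final altitude:", height)
    print("direction changes:", changes)
```

Because this sketch only ever accepts uphill moves, it would get stuck on a lower peak in a landscape with several of them, which is the valley-crossing caveat mentioned earlier.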

5) Ideas = 1%, Execution = 99%
This one is really simple. Having the idea of starting a Hacking Science community, or the idea to write this blog post, was easy. Getting things organized, spending hours and hours writing the damn thing, however, is painful. Nevertheless, you wouldn’t be reading this without me writing it (I told you it’s simple). You wouldn’t have attended a Hacking Science event without someone organizing it.

Despite the simplicity, it is astonishing how easily this is forgotten. In science, this is particularly true, presumably because the scientific idea is more glamorous – your name will be attached to an idea, not to the conclusive empirical evidence supporting the idea (see the example of Albert Einstein – but beware of survivor bias). Darwin is famously remembered for the best idea anyone’s ever had – when really it is the 20+ years of work it took him to collect some evidence to support the idea that did the trick. Don’t get me wrong – execution without an idea is bound to lead to failure. But an idea without execution (or with bad execution) leads nowhere. Derek Sivers offers a great analogy:

It’s so funny when I hear people being so protective of ideas. (People who want me to sign an NDA to tell me the simplest idea.) To me, ideas are worth nothing unless executed. They are just a multiplier. Execution is worth millions.

6) Share widely
Share your data (e.g. on ScholarSphere). Share your code (e.g. on GitHub). Share your science (by publishing in open access journals). A Hacking Science community can only thrive as a sharing community. Almost all of the tools we are using are open source. It’s really a no-brainer.

(written by Marcel Salathé)

Have you truly just read more than 2,500 words? If you are still here, you should probably follow HS on Twitter, and subscribe to the email list.

Nov 12

BYOD – Bring your own data (problems)

Please come and join us for the first BYOD Hacking Science event on Thursday, November the 29th in the Millennium Science Complex (W-306).

The BYOD event is an opportunity for you to give your fellow Hacking Science community members a window into your data. Highlight your data set. Gripe about a common data problem you run into. Share why working with data is problematic. Demonstrate a cool tool that saves you a bunch of time when working with data. Bring your own data, bring your hacker spirit, and the result will be a better understanding of “data science” at Penn State, which will be used to form an agenda for an upcoming hackathon event. Participants are encouraged to talk about their projects, data, problems and solutions in a rapid presentation not to exceed 3 minutes.

If you want to show slides, or actual data, please bring along your laptop.

Free food will be served.

Please register here: http://byod-hacking-science.eventbrite.com/
