Data Science Environment Summit

October 8, 2014

This week, I participated in the Moore-Sloan Data Science Environment Summit, a collaborative meeting between three data science centers at UC Berkeley, NYU, and the University of Washington. By virtue of being a fellow with the Berkeley Institute for Data Science, I was invited to participate in this meeting along with the PIs, Co-PIs, Senior Fellows, and other researchers who are part of BIDS, NYU’s Center for Data Science, and the University of Washington’s eScience Institute.


Our director, Kevin Koy, spearheaded the organization of this first annual meeting and made the wise choice to host it at the Asilomar Conference Grounds near Monterey, CA.

The Monterey Bay Aquarium

The first thing on the agenda was the Monterey Bay Aquarium. I’d always wanted to visit, and it was everything I could have imagined an aquarium might be. There’s an excellent jellyfish exhibit as well as a brand-new cephalopod exhibit. The octopuses, squids, and cuttlefish were amazing creatures.

The Conference Grounds

I learned from Professor Henry Brady and his lovely wife that Asilomar was designed by architect Julia Morgan in a beautiful Arts and Crafts style. The grounds are simply lovely and have a fun, if cold, nearby beach.

cold Pacific

Bonding and Community Building

To solidify the collaborative bonds between the diverse scientists at the geographically dispersed campuses involved in the collaboration, we did various bonding and collaboration-building activities. While that’s not usually my jam, I did learn a few things.

Dav Clark FTW

The most important thing I learned was to have Dav Clark on your team. If possible, maximize the experience by having an extra David Clark on your team as well. Out of five people, our team had two Dav* Clarks and we were the winners. Coincidence? I think not.

Fail Fast

The other lesson from the bonding activities was straight out of agile workflows: fail fast. We were all separated into teams to build towers out of spaghetti and tape that could hold up a marshmallow. We had 18 minutes. By prototyping quickly and racing ahead with the first reasonable-sounding idea that was suggested, we started actually building before a lot of other teams and had more time for quality control. In an analogy to the collaboration, this lesson suggests that rather than spending much longer putting a plan together, we should just start moving with the ideas that we already have. Ideally, by allowing ourselves room to fail fast, we’ll actually get some fast successes along the way.

Of course, it's better to succeed fast


Rather than allow a planning committee to dictate the meeting subject matter, the attendees collaborated on a conference schedule for Monday and Wednesday mornings. It resulted in a set of topics I had trouble choosing between (being only one person, I can only attend one session at a time, after all).

Visualization Session

The first session I attended was the visualization session, where we mentioned (and cheered for) a few interesting tools.

Demos Session

I ended up (by virtue of being outspoken and often standing within earshot of Kevin Koy) leading the tools session. It was a great series of lightning talks (7 thrilling minutes per person!):

Working Groups

On Tuesday, working group sessions took place simultaneously in many different rooms and covered most of the scientific computing topics I’ve been interested in for the last five years, including:

Most Influential Work

A fascinating presentation by Mark Stalzer described what the Moore Foundation learned about the most impactful works in data science. Each of the more than 1100 pre-applications for the Data Driven Discovery Investigators program included up to five references that, in the opinion of the author, were some of the most influential in data science. The resulting bibliography:

Research Microblogging

Finally, I learned this morning that David Hogg, who was an exceptional leader of this event, blogs five times a week about research! Amazing, IMO. The rules he has instituted for himself are:

I must post five days per week (I choose which five), except when traveling, no matter what I have done.


I must write only about research; no committees, no refereeing, no teaching, no excuses.

It seems overly ambitious and questionably helpful, but I guess he’s been doing it for years, so it must be serving him well somehow. In the same way that Titus' blog is an impressive body of work, so too is Hogg’s blog. I’m inspired.

He also has a teaching blog.





Creative Commons License
This work by Katy Huff is licensed under a Creative Commons Attribution 4.0 International License. Based on a work at