10 reasons scientists should use version control

  1. You don’t have to worry about breaking your code. 
  2. Back up your work. Your server isn’t as secure as you think it is and it’s too easy to accidentally overwrite files on Linux.
  3. Permits safe tidying of unused code. No need to comment things out “in case they’ll be useful later.” This will help keep your code clean and readable. Marie Kondo would be proud.
  4. Easier searching through older code. You will re-use code often; unless you slay at file management, you’ll waste time searching for that elusive function. Also, if you use an online repository, you can have access to your files anywhere.
  5. Provides encouragement to write good comments and include a README file. While it doesn’t FORCE you to write good comments, it’s a low-investment, high-return activity. Also, write and update that README file – you might know how to run your intricate code today, but Future You won’t. Sidenote: What makes a good commit?
  6. Encourages you to step back and think about the big picture. Pausing to commit changes forces you to step back and summarize your work as well as plan the next step.
  7. Transparency! Share code with others using an online repository, especially if you followed reason 5.
  8. It’s free.
  9. Creates an online portfolio of your work. Portfolios aren’t just for artists anymore – online repositories show off your skills, and version control is increasingly being used outside of CS. It’s a great addition to your resume!
  10. It just feels good. When your work is largely invisible, looking at all your projects and repositories can give you a sense of accomplishment.

If you’re convinced, I wrote a highly general overview for this highly useful tool!


Earth Science at NASA vs. NOAA

Having worked for both NASA and NOAA, I wanted to compare what is the same and what is different. For the most part, my day-to-day work is pretty similar. In general, there is a great deal of overlap and collaboration between the two agencies. Many people have moved between NASA and NOAA, and there are always rumors going around about branches being merged into one or the other. That being said, there are also many differences in the goals and management of each.

I will preface this post by saying that neither is “better” or “worse,” because ultimately the agencies carry out their goals successfully. Also, here is the usual disclaimer: these are my opinions as a human and do not represent the agencies which contract me.

To start, the two agencies have (shock!) different missions: NASA’s mission is more research-oriented, science-for-the-sake-of-science, whereas NOAA’s mission is focused on providing data necessary for public safety and commerce (mainly, accurate weather forecasts). (UPDATE 6/26/2018: And if you’re thinking that mission statements don’t matter, NOAA’s made the news recently, although I think this is a lot of noise.)

So here are some of my observations:

  • NASA’s Earth Science has an FY18 budget of about $1.75 billion (out of $19.1 billion total), while NOAA’s total budget is $4.78 billion.
  • NOAA satellites tend to have many instruments on-board to collect a variety of measurements while NASA satellites may have one to two flagship instruments to do one thing really well.
  • NASA Earth Science Missions focus on data continuity and consistency, while NOAA invests heavily in data availability – you don’t want satellite data to be unavailable during a tornado warning.
  • Going back to the science meeting question: NASA datasets are typically released only after careful internal review, and if changes are made later, NASA will reprocess earlier data to the new version for consistency. NOAA, on the other hand, will release the datasets earlier to users with the following disclaimer:

These GOES-16 data are preliminary, non-operational data and are undergoing testing. Users bear all responsibility for inspecting the data prior to use and for the manner in which the data are utilized.

With great data comes great responsibility. I discussed NOAA data maturity in an earlier post.

  • NASA doesn’t have as much “always on” pressure. Neither agency has infinite resources; the sacrifice NOAA makes is the processing and archiving of past data. Thus, once a product is operational, very few, if any, changes will be made.

More humorously:

  • NOAA makes acronyms from words. For instance, NOAA has a VPPPGB; that is, a Verification, Post-processing and Product Generation Branch. Meanwhile, NASA prefers to force words into acronyms, like they did for the Ice, Clouds, and Land Elevation Satellite (ICESat) mission.
  • I still don’t get how NOAA is organized: There are centers within centers, and then divisions, and then branches, and then groups, but my group has members in different branches. NASA has a “code” structure where each lab gets a number. It’s very consistent, although I sometimes have to look up what the numbers are. Also, many of the “labs” do not have a lab at all.
  • NASA folks can get Apple computers. NOAA scientists all have big clunky Dell computers.
  • People dress more professionally at NOAA. At NASA, I regularly saw people in Birkenstocks, socks, cargo shorts, and tie-dye T-shirts.
  • NASA has much cooler swag and brand recognition. I liked telling people that I work at NASA. Meanwhile, NOAA’s gift shop was shut down. When I tell people that I work for NOAA, they say “huh?” and I try to change the topic.

In summary, when combined, NASA and NOAA are much like a mullet: business on top, party in the back.

Primer on NOAA Data Maturity

A wealth of earth-observing satellite data is freely available on the internet, such as through NASA’s Mirador portal or through NOAA’s CLASS. While I still occasionally see folks distributing data via (cringe) hard drives, strides have been made to standardize the way the data can be found online, downloaded, and structured, such as through use of standardized formats like HDF (NASA) and NetCDF (NOAA).

GOES-R (now GOES-16) is a state-of-the-art satellite that launched in late 2016. The ABI instrument is to previous imagers what Blu-ray was to DVD in quality. Data are now available, but many products are still under evaluation, with the following maturity levels:

  • Beta: the product is minimally validated and may still contain significant errors; based on product quick looks using the initial calibration parameters.
  • Provisional: product performance has been demonstrated through a large, but still (seasonally or otherwise) limited, number of independent measurements. The analysis is sufficient for limited qualitative determinations of product fitness-for-purpose, and the product is potentially ready for testing operational use.
  • Full: product performance has been demonstrated over a large and wide range of representative conditions, with comprehensive documentation of product performance, including known anomalies and their remediation strategies. Products are ready for operational use.

Specific criteria must be met for a product to “graduate” from one level to the next. For numeric quantities, this usually takes the form of statistical comparisons with surface measurements or established satellite observations. Metrics such as the accuracy, precision, root mean square error, etc. must meet a pre-determined level before the product advances to a new maturity level. For categorical or binary detection, such as with cloud or smoke detection, the frequencies of false positives, false negatives, true positives, and true negatives are evaluated. With each maturity level, the criteria become stricter, although the dataset will never be “perfect.” Some degree of uncertainty will inevitably be present.
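To make those comparisons concrete, here is a minimal sketch in Python. It is not NOAA’s validation code. I am assuming one common convention in which “accuracy” is the mean error (bias) and “precision” is the standard deviation of the errors. The function names and the sample numbers are made up for illustration, and the real thresholds live in each product’s requirements documents.

```python
import math


def continuous_metrics(retrieved, reference):
    """Compare satellite retrievals against reference (e.g., surface) values.

    Assumed convention: accuracy = mean error (bias),
    precision = standard deviation of the errors.
    """
    errors = [r - t for r, t in zip(retrieved, reference)]
    n = len(errors)
    bias = sum(errors) / n
    precision = math.sqrt(sum((e - bias) ** 2 for e in errors) / n)
    rmse = math.sqrt(sum(e ** 2 for e in errors) / n)
    return {"bias": bias, "precision": precision, "rmse": rmse}


def detection_counts(detected, observed):
    """Tally the four outcomes for a binary product such as a cloud mask."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for d, o in zip(detected, observed):
        if d and o:
            counts["tp"] += 1   # detected and really there
        elif d and not o:
            counts["fp"] += 1   # false alarm
        elif not d and o:
            counts["fn"] += 1   # missed event
        else:
            counts["tn"] += 1   # correctly reported nothing
    return counts


# Hypothetical sample values, for illustration only.
stats = continuous_metrics([1.0, 2.0, 3.0], [0.0, 2.0, 4.0])
table = detection_counts([True, True, False, False],
                         [True, False, True, False])
print(stats)
print(table)
```

A product would then “graduate” only if each metric falls within its required limit, and those limits tighten from beta to provisional to full.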

The different GOES-16 ABI data products are also interconnected. All the downstream Level 2 products depend on Level 1 radiance data, so the Level 1 products reached maturity first. Many Level 2 products use other Level 2 products as inputs; thus, more time is given to these products. Data in the beta stage can only be accessed after a user makes a request, but provisional and fully validated data are available to the public.

Go to the NOAA CLASS website to get the most up-to-date information on product maturity and availability. To learn more about the maturity criteria, you can access the RIMPs for all GOES-16 products … although a strong beverage may be in order before reading.

Book Review: Clean Code

In the age of satellites and large climate models, Earth Scientists spend much of their time developing custom code and scripts to analyze them. I have also many times been handed legacy code to improve upon, only to find that it is just as time-consuming to update the code as it would be to write it from scratch.

So what are the best programming practices? According to CS academic Ian F. Sommerville, attributes of good software include:

  • Maintainability: software must evolve as user needs change.
  • Dependability: bug-free, consistent results.
  • Efficiency: uses computing resources wisely.
  • Acceptability: it does what it’s supposed to, according to the user.

The priority order will change depending on the project. If you work with smaller data sets, efficiency will be at the bottom of your list; speeding up your program to run in 30 seconds instead of 45 seconds is neat but isn’t usually worth the time. A throwaway program to do a one-off task doesn’t need to be reusable or dependable; it just has to get the job done correctly.

However, I’ve discovered that most of the time code will be re-used again and again, especially when working in a research team. Thus, maintainability is at the top of my priority list. That’s where clean coding practices come in.

In Clean Code: A Handbook of Agile Software Craftsmanship, Robert “Uncle Bob” Martin goes through his philosophy of what it means to write maintainable code. Some vocabulary that he uses to describe his vision of clean code includes elegant, readable, focused, literate, efficient, simple, pleasing… essentially, write code so readable that it flows like prose. The rest of the book is dedicated to how to achieve this.


Want to storm chase from your desk?

I recently had a publication accepted into the Journal of Geophysical Research – Atmospheres. In my paper, I discuss how individual cloud “clusters” evolve (grow, mature, and decay) over their lifetimes on global scales, thanks to long records of next-generation geostationary satellites. I found that cloud maturity was a longer, more delayed process in longer-lasting storms. Over land, there tended to be a single daily maximum in clouds and storm development in the afternoon. In contrast, over the ocean, there are two peaks: one in the afternoon and one in the early morning. Early morning clouds tended to last longer (> 6 hours) than those in the afternoon. My paper provides a big-picture survey of the life cycle evolution of cloud characteristics as seen from infrared satellites, which can complement climate model simulations and enhance satellite retrieval algorithms.

The paper is online now on JGR’s website, although it’s behind a paywall until 2017. In the meantime, you can read the “comic book” edition below with a surprise morbid ending.

Comic edition – with surprise morbid ending.

Women in Science

I had an opportunity to have lunch with the speaker of the distinguished female faculty lecture, along with several other graduate students, on the topic of women in science. Below are some of the interesting discussion points that our speaker shared with the group:

  • Finding your voice as a female graduate student. A student shared that she received some well-intended advice that she needed to speak more confidently during presentations – however, she actually felt quite confident speaking, so it came across as “stop talking like a woman and talk more like a man.” The speaker shared that being professional doesn’t mean you need to speak and act in a traditionally masculine way. It’s hard to say what is truly related to confidence and what expectations result from implicit bias. It’s important to point out this isn’t a “male” or “female” issue – anyone who appears to lack “masculinity” in their speaking style can be perceived as “less effective.” In short, continue to hone your presentation skills and style, but success does not require that you “talk like a man.”
  • Being comfortable in your own skin. Continuing from the last point, the idea of being a professional woman doesn’t mean you have to dress and act like a man. It’s important to be comfortable in your own skin and pick an environment that allows you to do so.
  • Bringing up the topic of work-life balance. Work-life balance is of particular importance to males and females who are interested in starting a family when they pick a research or postdoc advisor. How do you bring such a topic up? Rather than raising the issue directly (e.g., nobody really should ask if you’re trying to have a kid at a job interview), one can bring up family time in general or ask other members of the group about their experiences or those of people around them.
  • Reporting sexual harassment. There has been a lot of press on this topic lately. It’s also important to identify a male or female faculty member with whom you can discuss such an issue if it were to arise. However, it might be too intimidating to talk to a faculty member, so establishing a peer-to-peer support network at your university could be another avenue. This allows students who want to bring up issues to have a safe place to discuss what happened and learn about their resources.
  • Being interdisciplinary in practice, not just in name. Some universities have coursework where students from different disciplines are placed in a group to find solutions to problems, often yielding some fascinating research that has led to publications!

If these issues interest you, I highly recommend Women in Science: Meeting Career Challenges, a collection of essays edited by Angela M. Pattatucci.