Open Source 101 – Building Community in Raleigh, NC

On Saturday, February 4th, I was fortunate enough to spend my day attending and speaking at Open Source 101 (http://opensource101.com/) in Raleigh, NC. Open Source 101 is a one-day conference for developers who are new to Open Source and is run by the awesome team behind All Things Open (https://allthingsopen.org/). The focus of the conference is to help attendees learn how to participate in and benefit from Open Source software. Looking around the room during the opening keynote talks, it was hard not to be impressed by the more than 500 attendees who had chosen to spend their entire Saturday at this inaugural event. I was honored to be included alongside such a highly respected group of speakers.

My presentation, cheekily entitled Create your first open source project the “right way”, was designed to provoke conversation around the minimal things that developers can (and should) do when open sourcing their code. Although I have spoken many times about different hard skills in software development, this was the first time I have tackled the softer skills related to building community around Open Source software and I relished (and was honestly a bit terrified of) the challenge.

Having worked in and around Open Source for the last seven years, I have developed some thoughts on what it takes to create a good Open Source project (beyond simply writing solid code). For the presentation I boiled these thoughts down to the following eight ideas or recommendations:

  • Pick the right license for your project
  • Pick the right host for your project (GitHub, Bitbucket, CodePlex, etc.) that has the tools you need to build a community around your project
  • Write good, solid documentation that explains clearly the two most important things about your code: what it is and how to use it
  • Keep living documentation of your code through issues whether it be bug reports, ideas for enhancements, questions, or requests for help
  • Write unit tests for your code (striving to maximize coverage) and keep the unit tests current as you write new code
  • “Release early and release often” – even if all you are releasing is a script, package your code and documentation up, put a version number on it and release it
  • If you want people to use your code, build community around your code, or just learn something from feedback, then ask for help
  • And of course, be gracious in accepting help when it is offered

Of course, it takes more than simply following these eight guidelines to ensure a project is a “success”; however, neglecting too many of them nearly guarantees failure.

When I first got word from the conference organizers that my presentation had been accepted, I immediately set to work on filling in the blanks of the abstract I had submitted. Not long into a first draft of an outline, I started thinking about the slides and how I would convey my story visually. And then, I had a brilliant idea (one I freely admit to stealing) — I would hand draw my slides! Over the past several years I have seen a handful of presenters who used hand drawn slides and each time, I appreciated their effort and remembered it more than the standard “corporate” slide deck.

“How hard could it be to hand draw twenty slides?” I thought. It turns out that it was actually pretty hard. The biggest struggle I experienced was resisting the urge to make each slide a “work of art.”

In the end, I gave my presentation to a friendly and engaged crowd. There were good questions about the material and some really nice compliments on the slides. One woman even told me that my “bug” was now the background image on her phone. Hopefully everyone left with some new ideas on how to make their future Open Source efforts more fruitful.

If you are looking for Open Source conferences to attend or speak at, I can’t recommend highly enough All Things Open and Open Source 101. Todd Lewis (https://twitter.com/toddlew) and his team do an amazing job at putting together events that have an embarrassing richness of content in a welcoming and inclusive environment.

Note: My presentation Create your first open source project the “right way” is Open Source material released under the Apache 2.0 License. You can find the full presentation on GitHub at: https://github.com/cvitter/Open-Source-101-Doing-It-Right.

Please reach out via twitter with any feedback.
Craig Vitter
@craigvitter

Easy Time Series Analysis With NoSQL, Python, Pandas and Jupyter

I was really honored to speak at All Things Open (allthingsopen.org) this year. All Things Open is an absolutely amazing gathering of over 2400 open source practitioners in Raleigh, NC (which just happens to be blessed with amazing barbecue and a really nice conference center). This year’s conference was packed with high quality presentations, great attendees, and some awesome social events, including the conference-ending soiree at the Boxcar Bar and Arcade (theboxcarbar.com/raleigh/). I was also happily surprised at the large number of people who turned out for my presentation: “Easy Time Series Analysis With NoSQL, Python, Pandas & Jupyter”. As it turns out, I had a packed room for my talk about putting together a cheap (free!) and cheerful set of tools to do time series analysis.

Up until just recently, doing time series analysis at scale was expensive and almost exclusively the domain of large enterprises. What made time series a hard and expensive problem to tackle? Until the advent of NoSQL databases, scaling up to meet increasing velocity and volumes of data generally meant scaling hardware vertically by adding CPUs, memory, or additional hard drives. When combined with database licensing models that charged per processor core, the cost of scaling was simply out of reach for most.

Fortunately the open source community is democratising large scale data analysis rapidly and I am lucky enough to work at Basho which is making contributions in this space. In my talk I introduced the audience to Basho’s open source time series database Riak TS (http://docs.basho.com/riak/ts/) and demonstrated how to use it in conjunction with three other open source tools (Python, Pandas, and Jupyter) to build a completely open source time series analysis platform in next to no time at all.

I think that Riak TS is a particularly exciting addition to the open source world of databases for a couple of reasons. To start, you would be hard pressed to find a time series database that can scale from one to over one hundred nodes on commodity hardware with so little effort in the ops department. Riak TS automatically handles the distribution of data around your cluster of nodes, replicates your data three times to ensure high availability, and has a host of other automated features that are designed specifically to maximize uptime while making it easy to grow your cluster to meet your scaling needs.

Developing applications on top of Riak TS is just as easy (whether you work with Java, Python, Ruby, Go, Node.js, PHP, .NET, or Erlang) as installing and running the database. One of the coolest features for developers is Riak TS’s use of ANSI-compliant SQL. While SQL may not be the coolest, latest thing in the world of big data, it certainly makes Riak TS accessible to a wide range of developers and, maybe even more importantly, business/data analysts.

My talk started off with an introduction to Riak TS, a key-value database optimized to store and retrieve time series data while being able to scale to meet truly massive data sets. During the “academic” portion of the talk I covered the architecture of Riak TS, its feature set, and some of the unique things that set it apart from other time series databases currently available. I also covered some example Riak TS use cases and how the use case affects the way that you go about modeling data.

In the “practical” portion of my talk we covered the basics of getting started with Riak TS:

  • Installation – where to get Riak TS, how to install it, and how to scale it up as the size of your data problem grows;
  • How to get started interacting with Riak TS using the built in riak-shell and Python using the Riak Python Client;
  • How to create a new table in Riak TS and verify that it was created;
  • And how to query Riak TS using both the riak-shell and Python.
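
As a rough illustration of the querying step above, the sketch below builds a time-bounded SQL string, assuming timestamps are expressed as milliseconds since the Unix epoch. The table name BikeStatus and the column names station and time are hypothetical, chosen only for this example; they are not taken from the talk's code. Passing the string to riak-shell or the Riak Python Client is left out because that needs a running cluster.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(moment):
    """Convert a timezone-aware datetime to whole milliseconds since the Unix epoch."""
    return (moment - EPOCH) // timedelta(milliseconds=1)


def build_range_query(table, station, start, end):
    """Build a SELECT limited to one station and a half-open time range.

    The table and column names are hypothetical placeholders.
    """
    return (
        "SELECT * FROM {table} WHERE station = '{station}' "
        "AND time >= {start} AND time < {end}"
    ).format(
        table=table,
        station=station,
        start=to_epoch_ms(start),
        end=to_epoch_ms(end),
    )


if __name__ == "__main__":
    # One full day of readings for a made-up station id.
    day_start = datetime(2014, 8, 29, tzinfo=timezone.utc)
    print(build_range_query("BikeStatus", "70", day_start, day_start + timedelta(days=1)))
```

Plain string formatting keeps the sketch short; it is not safe for untrusted input.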

During the practical portion of the walkthrough we also loaded over three hundred and fifty thousand records from the Bay Area Bike Share open data set (http://www.bayareabikeshare.com/open-data) to demonstrate how fast Riak TS is at both reading and writing data.

Having mastered the basics of using Riak TS, we moved on to the “advanced” portion of the talk where we introduced the Python Data Analysis Library (Pandas) and Jupyter (these two open source tools should be staples of any Python programmer’s chest of data analysis tools). After a brief introduction to Pandas and Jupyter we ran through some data analysis examples where we demonstrated the kind of insight we can gain using the tools and the Bay Area Bike Share data we loaded earlier on. We also covered how to use Python within Jupyter to:

  • Query Riak TS;
  • Convert a Riak TS resultset into a Pandas DataFrame;
  • Demonstrate some of the built in data analysis features of Pandas;
  • And finally we used the matplotlib library to demonstrate how to create data visualizations.
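
The result-set-to-DataFrame step above can be sketched without a running database. The column names and the four sample rows below are made up for illustration and merely stand in for what a time series query might return; only the Pandas calls are real.

```python
import pandas as pd

# Hypothetical stand-in for a query result: column names plus rows,
# with timestamps in milliseconds since the Unix epoch.
columns = ["station_id", "time", "bikes_available"]
rows = [
    [70, 1409270400000, 10],
    [70, 1409272200000, 6],
    [70, 1409274000000, 4],
    [70, 1409275800000, 8],
]


def rows_to_frame(rows, columns):
    """Build a DataFrame indexed by timestamp from raw rows."""
    frame = pd.DataFrame(rows, columns=columns)
    # Turn epoch milliseconds into proper timestamps so Pandas can resample.
    frame["time"] = pd.to_datetime(frame["time"], unit="ms")
    return frame.set_index("time")


df = rows_to_frame(rows, columns)

# Average number of bikes available in each hour-long bucket.
hourly = df["bikes_available"].resample("60min").mean()
print(hourly)
```

From here, hourly.plot() would hand the series to matplotlib for a quick chart, provided matplotlib is installed.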

If you are feeling particularly motivated to start analyzing time series data, you can grab all of my example code (which is open source of course) from the following repository on GitHub: https://github.com/cvitter/ATO2016.

Note: An early version of this blog post appeared on opensource.com before All Things Open: https://opensource.com/life/16/9/time-series-analysis-riak-ts.

NoSQL Riak TS Gets JDBC Driver Inspired by SQL

When Basho’s engineering team released Riak TS 1.0 back in December 2015 one of the features that I found most exciting was its use of standard SQL. I know that there aren’t a lot of people who get excited by SQL in this era of NoSQL databases but SQL isn’t dead just yet. In the 30+ years that SQL has been in use, it has had the opportunity to find itself integrated into the vast majority of databases and reporting tools used by enterprises. Essentially SQL has become the lingua franca of data analysis and by making SQL the query language of Riak TS, Basho made the database accessible to a wider range of potential users.

As cool as that is, as a developer, I realized that the use of SQL also made it possible to build a JDBC (Java Database Connectivity) driver for Riak TS. If you aren’t already familiar with the JDBC API, it provides Java applications standardized methods to connect to, query, and update data in any database (almost exclusively relational databases) that provides a JDBC driver. As an official part of the Java platform since 1997, JDBC has been widely adopted by developers. If you use a reporting tool like those available from Cognos, MicroStrategy, Business Objects, or Jaspersoft, then you can connect to any data source that provides a JDBC driver.

Once I realized how important a JDBC driver would be for Riak TS, I was compelled to write one. When I started down the path of writing a JDBC driver for Riak TS my goal was simply to use it as a learning opportunity; I wasn’t really convinced that I would have the time or ability to produce something that would be generally useful. As I worked on the driver the learning exercise became a viable project, so I’ve decided to open source the project and share my work with the community:

https://github.com/basho-labs/Riak-TS-JDBC-Driver

There are two main reasons why you would want to use the JDBC Driver:

  1. You are a Java application developer familiar with the JDBC API and want to integrate Riak TS into an application;
  2. You use reporting tools like BusinessObjects, Cognos, or Jaspersoft that allow you to connect to databases using JDBC drivers.

If you have one of the preceding uses for a JDBC driver for Riak TS, check out the ReadMe at https://github.com/basho-labs/Riak-TS-JDBC-Driver/tree/master/riakts.jdbc.driver for full details on the driver’s capabilities and how to get started using it. And of course, if you do use the driver, please leave feedback, submit issues, or submit pull requests.

Hiking the Appalachian Trail – Lessons Learned

The view is worth it

I have had a couple of days to process my first attempt at hiking the Appalachian Trail and figured it would be a good idea to make some notes on the lessons I learned from the experience… by the way, most of these lessons learned are not earth shattering revelations. Truth be told, almost every one of these mistakes could have been avoided if I had simply followed the advice I had already gleaned from books and articles on the subject of hiking. Experience is the best teacher though…

Water

I ran short of water with a few miles left in my hike because I decided not to take the 0.2 mile side trip to the spring near Compton Gap when I was a little over 5 miles from the end of my hike. There is plenty of advice related to water out there. The best advice is to not pass up the chance to top off your supply. Getting dehydrated is not good.

Another lesson learned is that I should carry something that I can use as a filter for my water bottles when attempting to fill them. Although I use Aquamira drops to make the water safe for drinking the chemicals don’t remove “floaties” from the water.

Sleeping

I am still new to the world of hammock camping so each time I hang my hammock from a tree and climb in is a learning experience. Setting out for the AT I knew that my first night on the trail the temperature was supposed to dip down into the low 50s and that I would be cold. I figured that wearing (and carrying in my pack) a pair of heavy fleece pants and a fleece jacket would be good enough inside my 55 degree sleeping bag to keep me warm. I was wrong. Although I was never in any real danger, I did end up spending much of the evening curled up in the fetal position trying to stay warm.

Why was I cold in my hammock? First off, I set up my hammock on the windward side of the mountain instead of the lee side (generally not wise) and the wind was blowing pretty good all night long. Second, I ignored the advice to use a pad and/or underquilt with my hammock for insulation. When the air temperature is 20 degrees warmer you can do without the extra insulation. When it’s 50 degrees, going without is asking for a long, cold night.

Food and Consumables

I over packed food and consumables for the trip I ended up taking. Part of this was unavoidable due to cutting a full night off of the trip but most of it was just due to being unsure of what to expect. Here is a list of items that I carried that went unused:

  • Cook set – alcohol stove, pot, coffee mug
  • Fuel – 8 oz of HEET
  • Coffee packets (decided to break camp and hit the trail without having any and don’t regret the decision at all surprisingly)
  • 2 packets of Ramen noodles
  • 2 packets of dried fruit
  • 1 Clif Bar
  • 1 Milky Way bar

If I had stayed with the original plan to sleep out two nights I would have used the stove to cook some Ramen noodles and I might have also brewed coffee in the morning. It is also possible that I would have consumed more of the other food items during the course of the trip. The real issue wasn’t so much that I was carrying too much (although I still have some experience to gain in planning) it was that my trip plan changed mid-trip.

Mileage

On the first day of my hike I got cocky and pretty much threw my trip plan out the window. I planned to make my northbound 27.4 mile hike in three days, but when my average MPH on day one was much higher than I had expected I started thinking that the hike was going to be easier than I had thought… so instead of walking 9 or 10 miles on day one I did 13. On day two I did around 15 miles and was in a lot of pain during the course of the day.

The lessons I took away from this were don’t get cocky and don’t overdo it. Most importantly, I think (since I am unlikely to remember the first two lessons for long), I should be doing more training hikes to prepare for the rigors of hiking, and I should make a plan and stick to it…

Technology

Technology is great but not something you should rely on when you are hiking in the wilderness (shocking I know). I brought my iPhone and a battery pack to charge it with and found that reception was basically non-existent for 90%+ of the 27.4 miles of my route. For the most part this was ok. I didn’t really wish to receive phone calls or emails when I was hiking. I had hoped however to post a few updates a day along the route to keep family in the loop with my progress and that turned out not to be possible. On day one I managed to get one update via Facebook at mile 5.8 into the hike. After that I couldn’t get strong enough reception until I was more than 22 miles into my hike and approaching Front Royal again.

On the plus side, I was able to put my iPhone into airplane mode and keep using it as a camera and take advantage of a pre-loaded map of the AT through Shenandoah National Park during the hike.

Pack

I have a GoLite Jam 70L Pack (as well as the 35L version) and this was my first real long distance hike with it. Overall the pack performed well but there were a few minor things that caused me problems.

First, I think that I should have gotten my pack in medium instead of large. I struggled along the way to keep the pack sitting properly on my hips and bearing the load. All this and I had the straps cinched down most of the way. Second, the cinch straps on the side of the pack can interfere with getting your water bottles in and out of the side pockets.

Overall these were minor issues that I think I can resolve with a bit more experience.

Conclusions

I survived the hike, yay! One of the most valuable lessons I learned from this hike was just that I could do it. I was a little nervous about attempting my first big hike on the AT and completing it goes a long way towards giving me the confidence to do it again.

Other than that, all of the issues I ran into were very minor and surmountable with more practice and miles on the trail.

Hiking the Appalachian Trail – Day 2

Holy crap it was cold overnight. I knew it was going to be cold but I figured I was prepared for it… sort of. I woke up several times in the night curled up in the fetal position inside my sleeping bag. Otherwise I slept great!

Gravel Spring

I was up and out of the hammock and had camp broken by a few minutes after 7. One of my first tasks for the day was to top off my water at Gravel Spring, an easy 1.4 mile hike to start the day off… plus 0.2 miles down a mountain goat path (and 0.2 miles back up that same mountain goat path of course). On the way down to the spring I started feeling a hot spot on my heel. After 14 miles of blister free hiking (with proper prep) I had developed a blister. Bummer…

After topping off my water bottles I applied a little moleskin to the blister, with tape to keep it in place. With the morning chores out of the way it was time to climb back up to the AT and continue on my way.

Up, Up, and Up Some More

I started the day feeling very, very positive about making it back to my Jeep by the end of the day. If I had paid more attention to the topography along my route I would have been much less positive.

From Gravel Spring my hike took me up to the summits of South and North Marshall for a total of 2.3 miles of mostly uphill slog. After about 4 miles of hiking I was really feeling my legs and all of the abuse they had been taking going up and down mountains. Of course every time I started to question the whole premise of hiking the trail I would come across a view like this…

The view from South Marshall

Going Down?

Where am I?

My legs were screaming when I reached the intersection of the AT and Dickey Ridge trails just past Compton Gap after 9.7 miles or so of hiking. Not having carefully studied the topography of my hike, I mistakenly assumed that the final miles were going to be downhill… oh, how wrong I was.

To be fair, the first few miles of the Dickey Ridge trail were relatively benign, almost pleasant. Almost. The sights were certainly pleasant enough but the pain in my legs was working hard to distract me from the scenery. I was in the final phase of my hike and the words “death march” were starting to leak into my line of thought.

It’s called Low Gap for a reason…

Three miles from my car (crossing the drive at Low Gap) is where I really began to doubt the sanity of pushing through a 15+ mile day of hiking. With a name like Low Gap… crossing the drive and looking up the trail I realized that I was in for yet more uphill hiking, nearly two miles worth of uphill as it turns out…

On the bright side, I was moving so slowly that I had plenty of opportunity to enjoy the scenery as I walked up and over Dickey Ridge.

Please Don’t Eat me Mr. Bear

I don’t know how fast I was walking at this point but it felt like a fraction of a mile per hour. My legs were stiffening up to a point where my walk was really more of a slow painful shuffle. At one point while I was crawling up Dickey Ridge there was a commotion in the bushes to my right. When I turned expecting to see a deer I saw a black bear shimmying up a tree 20 to 30 feet away from me. To say that I was alarmed would be an understatement. Fortunately it seemed like the bear was just as alarmed as I was and I was able to keep shuffling up the trail without needing to use my bear spray.

It’s Finally Over…

Dickey Ridge Visitor Center

I wish I could say that I strode into the Dickey Ridge Visitor Center but truth be told it was more of a limp. It felt amazing to collapse onto a bench though and stare out at the amazing view knowing that I survived my first big hike on the AT and learned some pretty valuable lessons along the way. Hopefully my next trip will end with slightly less pain!

Hiking the Appalachian Trail – Day 1

After months of planning I finally got up the nerve to take my first steps along the Appalachian Trail. The plan was to tackle the trail northbound starting at Thornton Gap and ending at the Dickey Ridge Visitor Center, hiking the roughly 27.4 miles in three days and two nights. Monday morning I caravanned to Shenandoah National Park with my brother and sister-in-law. I left my Jeep at the visitor center and then hitched a ride with my brother down to the Panorama comfort station at Thornton Gap.

The First Steps are the Hardest…

Concrete Spring Post, 3.5 miles into my hike

I admit being a bit nervous as I took my first steps onto the AT. Prior to starting out I had never hiked more than 8 miles in one day, much less attempted an overnight hike. As I started out I had a few silly thoughts like whether or not I was going the right way and whether or not I was going to make it back to my Jeep 27.4 miles away.

Starting at any place with Gap in its name should clue you in to the fact that a climb is in your near future. Starting at 2,307 feet, the two mile climb to the summit of Pass Mountain takes you to 3,052 feet. It was pretty cold Monday morning but 10 minutes into the hike I had to strip off my jacket.

The intersection of the Appalachian Trail and Thornton River trails 5.8 miles into my hike

Making Good Time

The first two hours of my hike were blissfully quiet and I didn’t run into another human until Beahms Gap Overlook (3.4 miles into the hike) where the trail crosses over Skyline Drive before heading back into the woods. Not far beyond that I ran into my first spring on the trail (see picture above). At 12:30 I reached the intersection of the Thornton River trail 5.8 miles into my hike (that’s 5.8 miles in 2 hours and 40 minutes of hiking, much faster than I had figured on doing). At this stage of my hike I was feeling very positive about my progress and was really enjoying the hike.

Eating at the Wallow

Elkwallow Wayside in the rain

The next major milestone on my itinerary was the Elkwallow Wayside 8.5 miles down the trail where I planned on getting a cheeseburger and french fries at the grill. The 2.7 mile hike to Elkwallow was mostly pleasant up until the last half mile or so climbing up to the Wayside. After 8 miles on the trail my legs were starting to get a little tired, my stomach was starting to rumble, and a cool drizzle began to fall.

Walking into the wayside was a bit of a shock after being almost completely alone on the trail. The wayside was pretty chock full of people buying lunch, souvenirs, and generally getting out of the rain. After getting my cheeseburger, french fries, and soda, I found a spot out of the rain where I could sit and eat in relative comfort (as comfortable as sitting on concrete on the ground can be). There were a couple of through hikers sitting near me who shared some trail stories while we waited out the rain.

Finishing off the Day

After hanging out for about 50 minutes it was pretty clear that the rain might keep going indefinitely, and I had to hike at least one or two more miles to reach a suitable place to camp for the night, so it was time to head out again. Unfortunately, as it turns out, the first two miles or so were a steady climb up to Sugarloaf and Hogback (a gain of about 1,000 feet in elevation although it seemed worse at the time).

View from Little Hogback Overlook

After a good 11 miles or more on the trail I was getting pretty tired but some little piece of my brain nagged me to keep going on since it was relatively early in the day and I was bound to find the perfect place to hang my hammock just around the bend…

I finally pulled up for the night after roughly 13.1 miles on the trail between the intersection of the Keyser Run Fire Road (13 miles) and Skyline Drive (13.2 miles) and ventured off trail to make camp.

My home for the night

Read about day two here: Hiking the Appalachian Trail – Day 2

Camping in the Chopawamsic Backcountry Area

Monday afternoon I ventured out to the Chopawamsic Backcountry Area of Prince William Forest National Park to break in my lightweight camping gear and Northwoods hammock. The Chopawamsic Backcountry Area is 1500 acres of backcountry with 8 camping sites. To access the backcountry you need a camping permit and key that you can obtain at the park visitor center for free (you still need to pay for your park entrance). The backcountry is separate from the park proper (find it on Google Maps). To get there you leave the park and drive a few miles west on Joplin Road, then turn on to a dirt road and drive another mile or so until you reach the gate that takes you to the backcountry parking (which unlocks with the key you get at the visitor center). Access to the 8 camping sites is from a two mile loop that starts and ends in a small parking lot with an ancient port-a-potty.

Site 3

When you check in at the visitor center they will ask you which one of the 8 sites you plan on camping at and put that down on your permit, so you should read the descriptions of the sites before you go or ask the staff which they recommend. I originally picked site number 2 because the description suggested a great view. When I hiked up to site 2 (and I mean up, the side trail to site 2 is pretty steep) I discovered that there wasn’t really much of a view and the trees weren’t really ideal for hanging a hammock, so I headed back on down the hill and on to site number 3, which was pretty much perfect. Since I was the only person in the backcountry that evening I was able to call the visitor center and let them know I was changing sites (important information in case of emergency so they know where to find your body).

My home away from home

This was my first time spending the night in a hammock so I took my time setting things up and playing around with the rigging. I paid extra attention to the tarp because the weather forecast said that there was a 65% chance of a thunderstorm in the evening, so I wanted to be sure I had some place to hunker down. Fortunately, although it was very humid, the thunderstorms missed me.

Sleeping in the hammock went pretty much about as well as I would have expected for a first time. There were definitely some lessons learned. For example, it turns out that my sleeping bag zips on the left side while the hammock zips on the right side. This slightly complicates the process of getting in and out. Next time I buy a sleeping bag…

Otherwise I slept fairly well. I woke up at one point in the middle of the night with a hyperextended knee. I’m not sure how to avoid that happening since I tend to move around a bit in my sleep. I also discovered that a pillow would make a great addition to my kit. Physically I felt great when I got out of the hammock in the morning. Better actually than most mornings when I roll out of bed. I could definitely get used to sleeping in a hammock.

Overall I am really happy with the outcome of this trip. My goal was to try out a whole bunch of new hiking/camping gear that I had never used before and build the confidence I need to take it out on to the Appalachian Trail. On top of a successful test run I got to enjoy a night in the woods without having to travel far from home. The Chopawamsic Backcountry Area is beautiful and well worth a visit.

Chopawamsic Backcountry Area

Here is a link to all 14 images I shot on Flickr: http://www.flickr.com/photos/craigvitter/sets/72157633568100134/