Blog

About This “Blog”

My experience with blogging has been through many seasons over the years.

I posted several times per week at TheRieslands.com when I was younger and had (many) fewer responsibilities.

I once used this domain as a sort of giant notepad document with everything I was interested in or curious about.

These days, I find myself in new seasons of life.

My work involves daily ingest of data that requires my continual attention.

And when I can make the time, I’m trying to push my running goals farther than ever. Last year, I finished my first 50-miler and got well past my 1,000-mile goal for the year. In 2016… who knows?


So, all that to say, my posting here won’t be very regular, but I’m not quite ready to give up on blogging just yet.

Cool?

OK…

National Whitewater Center 50-Miler (#WC50)

National Whitewater Center 50

Yep. That sign says “finish” and the beginning of a sunset in the background betrays that it literally took me all day, but I did it: I finished my first 50-miler.

After months of researching my options for my first 50, I finally decided on the National Whitewater Center’s 3rd annual race, held in October. It was a fantastic race: gorgeous trail, wonderful support folks, and the 13-mile loop provided 3 nice pit stops for changing clothes, restocking snacks, etc.

And 6,400 feet of elevation change was enough to offer a serious challenge, without totally demoralizing newbies like me.

The day got off to an EARLY start.

Like most ultras, the race began at the brisk hour of 5:00, which meant that most of us were out there on 4-5 hours of sleep.

I snapped this shot while waiting for my ride in the lobby of my hotel.

That was really my only potential complaint about the race: The closest hotels to the starting line were all 5-10 miles away, with no shuttles. That meant that I either had to find a ride or ask my wife to wake up WAY before dawn to drop me off.

Thankfully, my racing buddy was sleeping across the street and happy to give me a ride.

50-mile start

That’s one of my favorite things about ultra running: you meet some awesome folks and, sometimes, instant friends.

This is Chris. I met him at a 50K race that we both ran in the Spring. We traded contact info and did a couple of training runs together, and then attacked this bad boy as a team. Chris has been running for about 2 years. He went from 0 to 50 miles in a hurry!

As far as major takeaways, here are a few thoughts in retrospect:

50 miles is long enough to be kind of boring.

I almost never get bored while running, but I think I finally found the line where I start thinking “OK. I’m just ready to be done.”

I was really grateful for a well-supported race.

There was plenty of food and fluids every few miles, as well as smiling faces who always asked “Do you have everything you need?” I feel like I got a mental boost from this. “OK. I have everything I need. So just keep moving forward.” And I also wanted to keep going out of respect for all the volunteers who were there to help me accomplish this goal.

FOOD.

Usually, if I’m running anything less than about 25 miles, I don’t eat anything while running. Since I don’t have a large intestine, eating while running usually means that I’ll need to make a pit stop soon after. But 50 miles is more than long enough to burn through all the ready fuel in the body, plus most of the complex carbs that haven’t been queued up yet, and then lots of fat. And burning fat for long periods of time feels awful. It’s like mud in the gas tank.

So bottom line: I knew I had to eat. I decided to wait until the end of the first lap and then choose carefully. I wanted to pick up 250-400 calories per lap, and focus on things that would sit well and not upset my stomach. Towards the end of the race, when my hunger was really powerful, I ate more like 500 calories per lap.

ultra marathon food

I found that baked potatoes (and salt!) and banana mango coconut energy pouches – made by the Clif Bar folks – made a great combination for me. The former provided dense, slow-release energy. And the latter provided natural sugar. And both provided lots of potassium, which is CRUCIAL on a long race if you don’t want to be plagued by muscle cramps.

I also ate 3 or 4 bananas on the day before the race, in hopes of building up that potassium store.

I liked running (big) laps.

A lot of folks say that they would rather not run a lap race. If the laps were anything less than about 10 miles, I think I would agree. But I really like the setup of 4 long laps. Logistically, it meant 3 solid pit stops (I changed my shirt and shoes over the course of the race and also shed my headlamp, hat, and gloves).

But I think it also helped my mental race. After 3 laps, I was really feeling the miles, but looking at “1 more lap” felt like an achievable goal. As I left the smell of the food, the noise of the cheering on-lookers, and the comfort of other people around me, and entered those lonely woods for the fourth time, I knew what to expect. I knew when I was 10 miles from being done. Then 5. And that helped me to push. If, at 40 miles, I was unsure about how far I was from the finish, and what kind of terrain was between me and rest, I might have had a harder time.

I think I can go farther.

Crossing that finish line felt great. I was really looking forward to 3 things: a cold beer, a hug from my wife, and a chair.

But I honestly didn’t feel much different at mile 50 than I did at mile 45, and I probably could have pushed for another 5 or 10. That makes a 100K feel like a reasonable, attainable next goal. I’ll be thinking about this in 2016.

The infamous 100-miler isn’t really on my radar right now. I started getting a little bored with 50 miles. The whole “run all day AND all night” thing just sounds like too much of the same thing.

Maybe one day.

Mountains-To-Sea Trail 50K, Take 2

Sunday I took my second stab at the Mountains-To-Sea Trail 50K race, hosted by Bull City Running.

I ran this race a year ago, and it was my first “ultra”. My goal then, especially having just been in a car accident, was simply to finish.

I did, but I was a little disappointed with my time of 6:14.

My goal this time around was to finish and beat my previous time. And I’m happy to report that I did both, coming in at about 5:46.

Here are a few random notes, mostly for my own benefit when next year’s race is on the horizon:

  • 10:00/mile felt great for 20 miles. With a LITTLE more training, I think I could have finished stronger.
  • Heed is not my friend. I usually train with water or Nuun (no sugar). I drank a lot of Heed out on the trail and didn’t realize until later that it is full of sugar. I think this is one main reason that I cramped up so badly near the end.
  • Potassium is my friend. Miles 27-29 were pretty awful. My stomach, back, calves, and quads all cramped up in various ways. My quads literally locked up for several minutes. I believe this is at least partly due to a potassium deficiency. Next year, I’ll eat more avocado the day before the race, and more bananas (and fewer Fig Newtons and Twizzlers) during the race.
  • The pain passes. I really wanted to quit for a few minutes near the end. But the pain eventually passed and I felt (relatively) strong again.
  • I am not ready for 50 miles. I still want to do it this year or next year. But I see that I’ll need to get several more long runs in before I can finish a 50-mile race. I honestly don’t think I could have done 40 on Sunday. At least not on hilly trail.


Extending Ambari – Stacks and Views

NOTE: I wrote this post way back in March 2015. Quite a lot has changed in the world of Hadoop since then; Ambari alone has gone from 1.7 to 2.2 in that time. Combining all that I’ve learned in the past year, I’m honestly not sure that Hortonworks ever intended for folks to be developing custom stacks on 1.6 and 1.7. But I’m going to leave this post up for a while, since I think it’s still a decent tutorial on how to get started thinking about customizing and extending Ambari.


Apache Ambari is the go-to tool for provisioning, managing and monitoring Apache Hadoop clusters, and it does this job well.

You have a Hadoop cluster with N nodes and you need to install HDFS, Hive, Pig, ZooKeeper, Tez, etc. Thinking about doing this by hand is painful and overwhelming, but Ambari can do it with some minimal information about your cluster and a few mouse clicks.

Once you have your cluster humming along, Ambari also provides a good dashboard for monitoring each service on each node.

Ambari screenshot

But Hadoop is all about data. In the real world, a Hadoop cluster is usually a tool that serves data to a broader application. And that broader application probably needs some sort of installation/monitoring dashboard.

So, it seems that we could:

  • have two separate dashboards for the same application
  • somehow incorporate Ambari into the broader application dashboard
  • somehow incorporate the broader application dashboard into Ambari

As it turns out, all 3 of these options are possible.

Making two separate dashboards is straightforward enough, but comes with a lot of overhead. An engineer monitoring or troubleshooting the application has to look in two different places. And you end up with two separate sets of code.

Ambari provides a REST API that makes it possible to control/query it from another application, making the second option a possibility.
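To make that concrete, here’s a rough sketch of what driving Ambari from the outside looks like, using Ambari’s v1 REST API (the host, credentials, and cluster name below are placeholders, not values from my cluster):

# list the clusters this Ambari server manages (default port 8080)
curl -u admin:admin http://ambari-server:8080/api/v1/clusters

# drill into the state of one service in a (hypothetical) cluster named MyCluster
curl -u admin:admin http://ambari-server:8080/api/v1/clusters/MyCluster/services/HDFS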

But I’m interested in the third option:

How can I use Ambari to install, manage, and monitor a 3rd-party / non-Hadoop application?

My vision is to bring up my Ambari dashboard in a web browser and see all the default (Hadoop-related) options, as well as options for things like:

  • Install application X
  • Stats for application X
  • Start/Stop application X
  • Etc

The purpose of this post is to describe the hours that I’ve spent trying to make this happen.

Discovering Stacks and Views

I went into this task as an Ambari rookie. I had worked with my team to use Ambari to install a Hadoop distribution on our cluster, but that experience, along with some reading, was the full extent of my experience with Ambari.

So, faced with this task, I went to Google and searched for things like:

ambari monitor external application

or

ambari install external

This got me nowhere.

Hadoop is such a relatively young technology, in such a quickly moving space, that good information is hard to find. Ambari is particularly tough, because it changes so much between versions.

After banging my head on this wall for a while, I gave up on the quick-and-easy path and started studying the guts of how Ambari works. Since Ambari gives you control over child nodes from a master node, it is not a simple architecture. There is an Ambari server application, as well as Ambari client applications that run on each node of a cluster. These, in turn, interact with Nagios and Ganglia. Ganglia is deployed to each client to collect metrics, and Nagios is deployed as an alerting/alarm mechanism.

So ‘extending’ Ambari to applications beyond the Hadoop stack could potentially be a huge undertaking.

I sat on this for a couple of weeks and sought clarification from Hadoop distributors. Do other people do this? Is there a better way?

The answers I received were inconsistent.

One distributor told me unequivocally: This is not supported.

Another said: People do this, but it is beyond the scope of Ambari and you are on your own.

But then someone gave the hint I needed (and hadn’t been able to find on Google):

Look into the “Views” and “Stacks” features.

Bad advice from distributors be damned, what I was trying to accomplish is quite supported.

Ambari provides two separate plugin-style features: one for showing data beyond the Hadoop stack, and one for installing/monitoring services beyond the Hadoop stack.

Ambari Views

The “Views” feature of Ambari is designed to make it possible to ‘extend’ the Ambari dashboard without changing any of the core Ambari code. You simply package up a jar that follows some specific guidelines, place it in the appropriate folder, and restart the Ambari server.

And that’s it: you have new information in your Ambari dashboard.
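Roughly, the deployment loop looks like this (the views folder below is where my Ambari install looks for views; the jar name is hypothetical, and the jar itself has to contain a view.xml descriptor per the Views guidelines):

# drop the packaged view where Ambari looks for views
cp hello-view-1.0.0.jar /var/lib/ambari-server/resources/views/

# restart so Ambari extracts and registers the view
ambari-server restart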

Once I knew that “Ambari Views” was the magic keyword, finding information to make some progress here was not at all difficult.

Rather than repeat the effort of others here, I’ll point to two good resources:

Ambari Stacks

The “Stacks” feature of Ambari allows you to provide Ambari with a “Stack Definition”: functionally, an interface to install, manage, and monitor a set of Services, plus an extensibility model for introducing new Stacks and Services.

I’m still wrestling with Stacks. Below, I’ll share what I’ve tried so far. Hopefully it will get you going in the right direction, and if you happen to see what I’m doing wrong, please reply and let me know!

The official Apache Ambari website provides this link for getting started with Stacks. This provides a helpful big picture, but was clearly written by an engineer who is well-acquainted with Ambari and Stacks.

I wasn’t able to get anywhere with it, until I found this helpful post, which supplements the information.

The most important help provided by the second link is this:

Your component must be available as an rpm in your CentOS repo.

Combining the two resources, I tried to build a “Hello World” type of example:

First, I built a HelloWorld.jar file, containing a HelloWorld java program.

Then, I built a Hello_World-1.0.0-1.x86_64.rpm file, which should copy my jar file to a designated folder.
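In case it helps anyone retracing this step, a .spec along these lines will produce that rpm (this is a bare-bones reconstruction, not my exact file, and the install path is just a folder I picked):

cat > HelloWorld.spec <<'EOF'
Name:      Hello_World
Version:   1.0.0
Release:   1
Summary:   HelloWorld demo service for Ambari
License:   None
BuildArch: x86_64
Source0:   HelloWorld.jar

%description
Copies HelloWorld.jar to a designated folder.

%install
mkdir -p %{buildroot}/opt/helloworld
cp %{SOURCE0} %{buildroot}/opt/helloworld/

%files
/opt/helloworld/HelloWorld.jar
EOF

rpmbuild -bb HelloWorld.spec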

Next, I set up my Ambari server box as a Yum repository, and added the rpm file to the repository.

To test progress up to this point, I did “yum install HelloWorld” and ensured that the appropriate jar file was copied to the appropriate folder. Then I uninstalled so that I could test the process via Ambari.
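For anyone following along, those two steps amount to something like the following (the directory and URL are placeholders; createrepo comes from the createrepo package):

# put the rpm in a web-served directory and generate the yum metadata
mkdir -p /var/www/html/helloworld-repo
cp Hello_World-1.0.0-1.x86_64.rpm /var/www/html/helloworld-repo/
createrepo /var/www/html/helloworld-repo

# tell yum about the new repo
cat > /etc/yum.repos.d/helloworld.repo <<'EOF'
[helloworld]
name=HelloWorld local repo
baseurl=http://ambari-server/helloworld-repo
enabled=1
gpgcheck=0
EOF

# sanity check, then remove so Ambari can do the install itself
yum install HelloWorld
yum remove HelloWorld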

Next, following the (not totally clear) instructions on the Apache website, I created /var/lib/ambari-server/resources/stacks/HDP/2.2/services/HelloWorld and copied a variation of the example metainfo.xml file into it, pointing at my installer:

<osSpecifics>
  <osSpecific>
    <osFamily>any</osFamily>
    <packages>
      <package>
        <name>HelloWorld</name>
      </package>
    </packages>
  </osSpecific>
</osSpecifics>
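(For context, that fragment sits inside the larger metainfo.xml structure. Stripped way down, the whole file looks something like this, where the service name, version, and comment are my own choices:)

<metainfo>
  <schemaVersion>2.0</schemaVersion>
  <services>
    <service>
      <name>HELLOWORLD</name>
      <displayName>HelloWorld</displayName>
      <comment>Hello World demo service</comment>
      <version>1.0.0</version>
      <!-- ...the <osSpecifics> block shown above goes here... -->
    </service>
  </services>
</metainfo>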

Then I restarted the Ambari server.

Once Ambari came back to life, I was able to ask Ambari to add my HelloWorld service:

Ambari Stacks Tutorial Step 1

Ambari Stacks Tutorial Step 2

Ambari Stacks Tutorial Step 3

Ambari Stacks Tutorial Step 4

Ambari Stacks Tutorial Step 5

So far, so good.

But then I got an error:

Ambari Stacks Tutorial Step 6

Clicking on “failures encountered” showed me that I needed a “scripts” folder, with ui_server.py, slave.py, and master.py, as described in the tutorial.
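Those scripts all follow the same pattern: a class extending Script from Ambari’s resource_management library, with one method per lifecycle command. A minimal master.py sketch, assuming the tutorial’s layout (slave.py and ui_server.py look the same, modulo the class name):

cat > scripts/master.py <<'EOF'
import sys
from resource_management import Script

class Master(Script):
    def install(self, env):
        print 'Install the master'
    def configure(self, env):
        print 'Configure the master'
    def start(self, env):
        print 'Start the master'
    def stop(self, env):
        print 'Stop the master'
    def status(self, env):
        print 'Status of the master'

if __name__ == '__main__':
    Master().execute()
EOF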

No problem.

I added the files, restarted Ambari server, and started the process over again.

But this time, the process simply hung:

Ambari Stacks Error

This screen forever.

No errors in any log that I could find.

Restarting the server or the machine didn’t help.

I even started from scratch with a clean sandbox, in case the previous error state had shot me in the foot, but I have not been able to get farther than this dead-in-the-water state.

After lots of clicking around, I found that once I went to the ‘summary’ page, I had an option to ‘reinstall’ the service, and this actually worked, in the sense that it ran to completion and Ambari then recognized my “Hello World” as an installed service.

However, I was not able to find my HelloWorld.jar file at the expected location (specified by the .spec file I used to build my rpm).

At this point, I started digging through the log files looking for a clue.

According to the documentation, Ambari logs to /var/log/ambari-agent and /var/log/ambari-server.

What I found in these folders did not mention my service in any meaningful way.

So I spent some time poking around the file system and found that Ambari logs quite a lot in /var/lib/ambari-agent/data/.

As I dug through these files, I found that Ambari puts together a command that looks like this:

[screenshot: the full command Ambari assembled, from the agent logs]

Notice the repos where Ambari is going to look for my rpm:

[{\"baseUrl\":\"http://public-repo-1.hortonworks.com.s3.amazonaws.com/HDP/centos6/2.x/GA/2.2.0.0/\",\"osType\":\"redhat6\",\"repoId\":\"HDP-2.2\",\"repoName\":\"HDP\",\"defaultBaseUrl\":\"http://public-repo-1.hortonworks.com/HDP/centos6/2.x/GA/2.2.0.0\",\"latestBaseUrl\":\"http://public-repo-1.hortonworks.com/HDP/centos6/2.x/GA/2.2.0.0\"},{\"baseUrl\":\"http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.20/repos/centos6\",\"osType\":\"redhat6\",\"repoId\":\"HDP-UTILS-1.1.0.20\",\"repoName\":\"HDP-UTILS\",\"defaultBaseUrl\":\"http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.20/repos/centos6\",\"latestBaseUrl\":\"http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.20/repos/centos6\"}]

My home-spun HelloWorld rpm isn’t in those repos, and I think that may be why Ambari can’t find it.

So, I think the next thing to do is figure out how to configure which repos Ambari will use to find my installer file.
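My current guess (untested so far, and the file location is my best read from poking around the stack definition) is that the stack’s repoinfo.xml drives that list, so adding my local repo might look something like this, with the baseurl pointing at the yum repo I set up earlier:

<!-- /var/lib/ambari-server/resources/stacks/HDP/2.2/repos/repoinfo.xml -->
<os family="redhat6">
  <repo>
    <baseurl>http://ambari-server/helloworld-repo</baseurl>
    <repoid>HelloWorld-1.0</repoid>
    <reponame>HelloWorld</reponame>
  </repo>
  <!-- ...existing HDP and HDP-UTILS entries... -->
</os>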

This documentation looks promising for that. I’ll let you know how it turns out in my next post.

Update (3/31/2015): I found this presentation to be extremely helpful on this topic.

Umstead Trail Run

I’m one of those low-tech guys who still runs with an iPhone and MapMyRun.

I was playing on the MapMyRun website and discovered that it can generate a Google Earth ‘video’ of your run using the GPS coordinates.

This is really cool.

(The embedded video below requires the Google Earth plugin and you’ll probably have to explicitly give your browser permission to run it)


False Starts…

Well, 2015 hasn’t started quite the way I was hoping.

I tried to make good choices. I ran as much as I could over the November-January stretch where I usually get lazy and eat a lot of pie.

I picked a couple of races. Signed up. Made rough training schedules.

But man: these viruses.

There are 6 people in my house and we haven’t all been healthy at the same time since like Halloween.

And long, cold runs and the flu don’t go very well together.

So… I don’t have awesome weekly stats to report right now.

Last weekend I ran 16 comfortable miles. I was planning for 20 this weekend, but I’ve been sick. So… hopefully this week I’ll be able to ease back into it again.

I guess one positive is that the Uwharrie Mountain Run that I was so bummed about not getting a lottery spot for came and went this weekend.

I wouldn’t have been able to run and I would have been REALLY bummed about that.

Got my sights set on the Mountains To Sea Trail 50K in about 6 weeks.

Just gotta get healthy…

Potential 50-Mile Races

After a lot of consideration, I think I have 2 potential 50-mile races picked:

The Pilot Mountain To Hanging Rock Ultra

and

The WC 50

Both look to be extremely technical and hilly, and they are both within a 3-hour drive and promise some great-looking trails.

They are one week apart, so doing both is probably out of the question (even if I pick the 50K version of either of them).

The Pilot Mountain To Hanging Rock race is a one-direction race with a ton of elevation change. I think it would be the far more challenging of the two, but probably the more rewarding as well. The scenery promises to be fantastic.

The trouble would be in training: I simply don’t have the time or resources to run hilly trail often enough to be well-prepared for this race. I would have to just do my best to get in the miles and an occasional gym workout and hope that was sufficient.

The WC 50 is at the National Whitewater Center in Charlotte. I’ve spent some time mountain biking there and the trails are good – similar to what I’m used to running around Falls Lake. Trails like this are fairly hilly, but with lots of flat intervals to let you catch your breath.

This race is lap-based, which is sort of a double-edged sword:

Logistically it’s nice because you know what to expect as far as pit stops and you can change shoes/socks/etc every couple of hours. But mentally, it’s no fun to know EXACTLY how much further you have to go. Entering that last 12-mile loop with almost 60K of trail behind will be a serious gut-check.

I guess I need to make a decision and pull the trigger this month…

Run Re-cap (Christmas Break)

I was able to play catch-up a bit on my training over the Christmas holiday.

Over 14 days, I ran 5, 14, 6, 7, 12, 6, 7, and 21.

Each run felt pretty solid, with the exception of the last one: For the first time in a very long time, I was ready to quit by mile 6, and REALLY not feeling it by mile 13. I guess just chalk it up to being out of shape…

But recovery has been solid and I just registered for the “Mountains To Sea Trail” 50K in March.

I did this race last year and really enjoyed it. I’m hoping to be in slightly better shape this year, and if the conditions aren’t too muddy, my goal is to get under the 6 hour mark. Last year was VERY muddy and I finished around 6:15, but actually regretted not starting a little faster.

To Garmin Or Not To Garmin…

With the new year coming, I’ve been considering a switch from using MapMyRun on my phone to using a GPS watch.

Seems like there are lots of options and lots of opinions.

Do you use a GPS watch?

What are the pros?

What are the cons?

I’ll write another post about the ones I’m looking at and, if I make the switch, I’ll blog some thoughts about the experience.

Run Re-cap (4 and 8)

I’ve been thinking about how I want to share my ultra-marathon training on this blog.

I’m not really interested in trying to create any sort of comprehensive resource for runners.

My hope for this blog is that it will be a place where I share the short-and-sweet lessons I’m learning (and re-learning) as I train leading up to my first 50-miler, as well as the resources and tools that I find useful in the process.

If it ends up looking anything like the vision I have right now, it will probably be completely uninteresting to non-runners, and an entertaining and (hopefully) helpful 10-minute-a-week read for folks who put in miles on a regular basis.

Right now, I’m thinking I like the weekly recap of my outings with occasional other posts for things like “I love this resource” or “Who can help me with this issue?” type of stuff.

This week was somewhat special. I’m coming off the longest break from running that I’ve had in at least a year.

That break wasn’t really intentional: my kids brought home some kind of nasty virus and I felt WAY below average for a good 2-3 weeks. As much as I HATED the lethargic feeling of not breaking a sweat for 25 days, I felt like it would be irresponsible to take my already sick body out into the cold to suck in cold air for an hour.

So, Thursday I giddily put on my cold weather gear and knocked out 4.5 miles. I took advantage of the rested state of my feet, chose my Altra Instinct 1.5s, and took it nice and easy.

My legs felt SOLID. Like I could run forever. But I had a lot more soreness than usual once I was done. Particularly odd was that my groin was pretty sore for several days, and that’s usually not an issue for me.

Then, Saturday, I took my Brooks Pure Flow 2s out for 8 miles. Everything felt great except that I got a blister on my arch. I NEVER get blisters, especially there. But chalk that up to almost 4 weeks without running. The lesson there is: back off and let your body ease back into things. It’s better to take 3 weeks to get back to a comfortable 15 miles than to overdo it immediately and then spend a month trying to manage the injuries you incurred in the process.

Looking to next week: I’m hoping for 3 runs, at least 1 being at least 10 miles.

Permission denied by HDFS

I encountered an interesting hiccup while installing SPARK on our cluster this week.

We have the Hortonworks Data Platform flavor of Hadoop installed on our cluster.

We’ve been using it for a while and all the core installation-y issues have been worked out.

Logged in as ‘root’, calls to ‘hadoop’ and ‘hive’ work fine from the command line.

But after installing SPARK, the “SparkPi” demo program wouldn’t run:

./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster --num-executors 6 --driver-memory 2G --executor-memory 16G --executor-cores 16 lib/spark-examples*.jar 10

This magic command is supposed to launch the basic Pi calculation program and verify that all the pieces of SPARK are working correctly.

However, when I ran it, I got back this error, followed by a stack trace:

Permission denied: user=root, access=WRITE, inode="/user":hdfs:hdfs:drwxr-xr-x

But I’m ‘root’. Doesn’t root have permission to do everything?

Nope. With Hadoop, the ‘hdfs’ user is actually more powerful than ‘root’ – at least in some ways.
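You can see the problem in the error itself: /user on HDFS belongs to hdfs, and the permission bits (drwxr-xr-x) give nobody else write access. A quick check confirms it (the output line here is approximate):

hadoop fs -ls / | grep user
# drwxr-xr-x   - hdfs hdfs          0 2015-01-10 09:30 /user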

OK. So just ‘su hdfs’ and then run the command.

Nope. The ‘hdfs’ user doesn’t have the necessary permissions to execute all of SPARK’s stuff.

So what’s the solution?

Well, you need to use the ‘HDFS’ user to give the ‘root’ user permission to write to his folder in the HDFS file system:

su hdfs -c "hadoop fs -mkdir /user/root" (this folder probably already exists if you’ve been using hadoop)

su hdfs -c "hadoop fs -chown root /user/root"

Or, better yet, I created a ‘spark’ user, gave him permission to all the SPARK stuff, and then used the ‘hdfs’ user to give him necessary permissions:

su hdfs -c "hadoop fs -mkdir /user/spark"

su hdfs -c "hadoop fs -chown spark /user/spark"

Then the ‘spark’ user can successfully execute the SparkPi demo (and hopefully anything else).