The term “Big Data” has been flung around quite a lot lately. It is in the news all the time. We hear about how much it has pushed us into the future and into the internet of things. All of these things will produce useful data that will need to be stored and analyzed. One technology that we hear more and more about is Hadoop.
Hadoop was born as an open source project out of Google’s white papers on the Google File System (GFS) and MapReduce; it was created by Doug Cutting together with the open source community. MapReduce is the core of Hadoop, and allows the user to write fairly simple programs that distribute a workload across a very large amount of data. GFS inspired most of the work on the open source Hadoop Distributed File System (HDFS). HDFS is a redundant filesystem written in Java that distributes data across multiple machines, where it can then be analyzed with MapReduce programs. That is just a brief dive into what Hadoop is; if you want to learn more, I highly recommend you take a gander at the Yahoo Hadoop tutorial.
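To make the map and reduce steps a little less abstract, here is a minimal word-count sketch in plain Python. It is not Hadoop's Java API and it runs on a single machine; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration. It only shows the shape of the idea: map emits key/value pairs, the framework groups them by key, and reduce collapses each group.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, the job a real framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse all the counts for one word into a total."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data big hadoop", "hadoop data"]))
```

In a real cluster the map calls would run on the machines that hold each block of the file, and the grouped pairs would be shipped across the network to the reducers. That distribution is the part Hadoop takes care of for you.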
There is an ecosystem filled with projects that make managing this complex monster easier on administrators and developers. One of these projects that I really enjoy is Hue, the Hadoop User Experience. It gives the user a web interface for querying their data through some of the other projects that live in this big data ecosystem.
Each of these tools sits in front of a plethora of data that the user is analyzing. This data can be anything from a company’s customer-generated data that tells a music service what song to play next, to another company trying to figure out which ads to serve you based on your browsing history. My point being: Hue has access to some seriously valuable information.
As with most technologies, security is often an afterthought. It is important that we test the security of these applications so that we can protect my data and your data from the evildoers who would sell that information or use it for awful things. Perhaps a criminal could use pilfered data about you to craft malware that you would more easily fall prey to.
cPanel has had a very large impact on the hosting industry. This single company has enabled people to build their dreams overnight with $5: the American dream. cPanel’s largest offering to the industry has been the cPanel/WHM web-server management software. It’s actually pretty stellar software, and it gives the systems administrator an abundance of tools to just get shit done on a large scale. The shared hosting market is huge. No, it’s colossal.
Shared hosting is essentially stuffing users onto one server and letting them share the server’s resources. I’ve seen cPanel servers with well over 1000 users. To an outside security researcher, this looks like a rich opportunity: take one machine, get a very large reward. With cPanel, each customer can have more than one website hosted in their account (sharing the same IP), meaning that if even a few accounts in the shared stack were somehow compromised, the amount of data at risk is pretty scary.
So, how could someone compromise a shared cPanel server, or at least enumerate its users? Well, with science of course!
Unfortunately, it’s become common practice to use incremental hostnames, for several different reasons. As I’ve written in other posts about enumerating subdomains, a lot of the results that come back follow an incremental naming structure. This is interesting, because one could use that structure to map and locate a lot of information about a particular group of servers within a given infrastructure (a production distributed MySQL farm, for instance: db1, db2, db3). This can be a powerful reconnaissance tactic.
How does this happen? Why do we use a naming system that exposes our servers to potential security issues? It’s pretty common for scalable systems to assign an incremented hostname to each new instance as it is created during auto-scaling.
For example, let’s say that we know the hostname of a server that looks like it could be incremented.
dustin@atxsec ~ $ host server1.gamingservers.local
server1.gamingservers.local has address 10.2.3.4
dustin@atxsec ~ $ host server10.gamingservers.local
server10.gamingservers.local has address 10.2.3.13
dustin@atxsec ~ $ host server11.gamingservers.local
server11.gamingservers.local has address 10.2.3.14
Now... I wonder what would happen if...
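As a rough sketch of where that thought leads, the snippet below generates candidate names from an incrementing pattern and keeps the ones that resolve. The helper names (`candidate_hostnames`, `enumerate_hosts`) and the `resolve` parameter are my own for illustration; the only real API used is `socket.gethostbyname` from the Python standard library. Only point something like this at infrastructure you are authorized to test.

```python
import socket


def candidate_hostnames(prefix, domain, start, stop):
    """Yield prefix<N>.domain for every N from start to stop, inclusive."""
    for number in range(start, stop + 1):
        yield f"{prefix}{number}.{domain}"


def enumerate_hosts(prefix, domain, start, stop, resolve=socket.gethostbyname):
    """Return {hostname: address} for every candidate name that resolves.

    `resolve` is injectable so the logic can be exercised without touching DNS.
    """
    found = {}
    for name in candidate_hostnames(prefix, domain, start, stop):
        try:
            found[name] = resolve(name)
        except OSError:
            # socket.gaierror (name not found) is a subclass of OSError.
            continue
    return found
```

Calling `enumerate_hosts("server", "gamingservers.local", 1, 20)` would try server1 through server20 and return whichever names resolve. The gaps in the numbering can be as informative as the hits.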
There are a ton of short-URL generating services out there, like Tinyurl and even Google’s URL shortening service. One of my favorites is a website called Shoutkey.com. What is so special about Shoutkey?
Well, Shoutkey by design uses words that could be found in a standard English dictionary. That means when you create a short link on the website, the end result isn’t some randomly generated URL; it’s based on a dictionary word. On the record: I’m a huge fan of Shoutkey. Let’s look at an example of these two types of services side by side and observe how their link structures vary.
a Tinyurl example
a shoutkey example
One might choose Shoutkey over Tinyurl so they can share the word with someone instead of sending them the direct link; the short URL is easy to share via speech. Now, this is obviously flawed: in theory, someone could automate creating short links to build up a dictionary of Shoutkey’s words, and then test that dictionary for valid links.
Nah, that’s too much time to dedicate and most likely wouldn’t even work…or would it?
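To make the idea concrete, here is a minimal sketch of the testing half: turn a word list into candidate links and keep the ones a checker reports as live. Everything here is an assumption for illustration. The `https://shoutkey.com/<word>` link shape is my guess at the format, and `is_live` is a hypothetical callable you would supply yourself (for example, one that makes an HTTP request and looks at the response), so the sketch makes no network requests on its own.

```python
def candidate_links(words, base_url="https://shoutkey.com/"):
    """Yield one candidate short link per unique, purely alphabetic word.

    The base_url default is an assumed link format, not a documented one.
    """
    seen = set()
    for word in words:
        key = word.strip().lower()
        if key.isalpha() and key not in seen:
            seen.add(key)
            yield base_url + key


def find_live_links(words, is_live, base_url="https://shoutkey.com/"):
    """Return the candidate links for which the caller-supplied is_live(url) is true."""
    return [url for url in candidate_links(words, base_url) if is_live(url)]
```

Keeping the checker separate means the word handling can be tried against a fake before anything real is touched, and a real checker should be rate limited and used only with permission.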