SPLK-2002 Splunk Enterprise Certified Architect – Post Installation Activities

  1. Understanding Regular Expressions

Hey everyone and welcome back. In the earlier video we looked at how to create our own custom add-on containing inputs.conf and outputs.conf, deploy that add-on through the deployment server, and see how the universal forwarder uses it to send log files. However, the universal forwarder is not limited to monitoring log files; it can actually do a whole bunch of other things. Today's video is about exactly that: the capabilities of the universal forwarder beyond monitoring a log file.

So let's look into some of them. For our demo, we'll use the official add-on that Splunk provides and push it through the deployment server to the universal forwarder. On Splunkbase, when we search for Linux, there is a Splunk Add-on for Unix and Linux, and you can see it is built by Splunk. This is the add-on we are interested in. If you scroll down a bit, you can download it: click Download, click Agree to Download, and it will be saved to your download directory.

Currently it is downloaded in my test directory. This is the Splunk Add-on for Unix and Linux, and we will move it inside the Docker container where our Splunk is running. To do that, I am in my CLI. I'll go to the test directory, and if I do a dir you will see the Splunk Add-on for Unix and Linux archive. Let's do a docker cp on the add-on archive and copy it into the tmp directory inside the container. Once it's there, I'll log into my Docker instance. If you go to the /tmp directory, you will see the Splunk Add-on for Unix and Linux archive. Let's quickly extract it: I do a tar xzvf on the archive and it is extracted. Now if I do an ls -l, you will see a Splunk_TA_nix directory.

We already know which directory an add-on needs to be in if we want to push it from the deployment server to the universal forwarder. So I'll do a mv on Splunk_TA_nix and put it in /opt/splunk/etc/deployment-apps. Once I have moved it into deployment-apps, go to the Splunk instance, open Forwarder Management, and under Apps you will see one more app called Splunk_TA_nix. This app has a lot of capabilities beyond just forwarding log files, and we'll look into what they are. Before we do that, let's click Edit here. For the server class I'll select linux_secure, I'll select the option to restart splunkd after installation, and I'll save the changes.

Now within Clients you will see two deployed apps: Splunk_TA_nix and kplabs_linux. Look how quickly the new add-on has been deployed. This is the reason I really love the deployment server: it is extremely fast and convenient for pushing any new add-ons you might have. Now that the app is deployed, log into the forwarder instance, go to /opt/splunk/etc/apps and do an ls; you should see the Splunk_TA_nix add-on. Let's go into it, and within the default directory you will see a lot of files, one of which is inputs.conf.

Now if I open inputs.conf, you will see a lot of inputs defined here, and every input has disabled = true. That means that even though the input is configured in this inputs.conf file, it is in the disabled state. So what we need to do is change this setting in the stanza from disabled = true to disabled = false. We'll copy this inputs.conf to the local directory, and now if I go inside the local directory, we have inputs.conf there and we'll edit that copy. Let's change a few of the stanzas here; you can see I am setting disabled = 0.
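If you would rather script that edit than do it by hand, here is a minimal Python sketch of the idea. The two stanzas, their intervals and the helper name enable_stanzas are illustrative assumptions, not a copy of the add-on's real default/inputs.conf, which has many more entries. It writes disabled = false; the video types disabled = 0 for the same purpose.

```python
import re

# Illustrative excerpt only (assumed stanza names and intervals);
# the add-on's real default/inputs.conf contains many more stanzas.
DEFAULT_INPUTS = """\
[script://./bin/ps.sh]
interval = 30
sourcetype = ps
disabled = true

[script://./bin/top.sh]
interval = 60
sourcetype = top
disabled = true
"""


def enable_stanzas(conf_text, stanzas_to_enable):
    """Return conf_text with 'disabled' set to false in the named stanzas only."""
    current = None
    out = []
    for line in conf_text.splitlines():
        header = re.fullmatch(r"\[(.+)\]", line.strip())
        if header:
            # Remember which stanza the following settings belong to.
            current = header.group(1)
        elif current in stanzas_to_enable and re.match(r"\s*disabled\s*=", line):
            line = "disabled = false"
        out.append(line)
    return "\n".join(out) + "\n"


# This text is what you would save as local/inputs.conf.
local_inputs = enable_stanzas(DEFAULT_INPUTS, {"script://./bin/ps.sh"})
print(local_inputs)
```

Keeping the change in the local directory, rather than editing default, is the same choice made in the video: the shipped defaults stay untouched and only your overrides live in local.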

I'll just pick some at random for our testing purposes. Once you have picked a few and set disabled = 0 on them, save the file, and we'll quickly do a Splunk restart. Once it has restarted, let's go to the Splunk instance and open the Search & Reporting app. If you open Data Summary and look under Sources, you will see that a lot of new sources have come up, such as df, netstat, openPorts, package, ps and others.

Let's click on ps. What you get here is the process output, that is, the process-related details. To see more, let's go back to the sources and open package; this is the package information that you now have in Splunk. There are 133 lines, and these are all packages such as binutils, groff and various others. Again, this needs proper formatting and proper parsing, but for now we are just exploring what details we are getting.

Since we are receiving all of these details from the forwarder instance, let's click here, and under Sourcetypes you will see a lot of source types. You have ps. You have top. Top is something I'm sure everyone is aware of: when you run the top command, you get a list of all the processes with their CPU, memory and time, along with the priority, the nice value, the PID and the user each process is running as.

You get the same output here. So the Splunk universal forwarder not only has the capability to monitor log files, it can also run various custom scripts that you define. That is the high-level overview of the capabilities of the Splunk Add-on for Unix and Linux and how you can use it to fetch various information from a universal forwarder. In the next video we'll explore the directory structure of the Splunk Add-on for Unix and Linux, look at what it is actually doing behind the scenes, and see how we can optimize it further.

  1. Parsing Web Server Logs & Named Group Expression

Hey everyone and welcome back. In today's video we will be discussing regular expressions. Regular expressions, also referred to as regex, are a very important topic, specifically when parsing logs. If you do not have proper field extractions and your logs are not parsed, the data is close to meaningless and you will not be able to find the needle in the haystack. This is why understanding regular expressions is important, and in today's video we'll have a high-level overview of them.

A regular expression, or regex, is a sequence of characters that defines a search pattern. Here is a very simple sentence to search in, which says there is a rainbow which arises on the south shore of Mumbai. When I just type rainbow as the pattern, that is a literal match: it literally means find me this specific word within the sentence. The characters in it are referred to as literal characters. However, there is also something referred to as a metacharacter. A metacharacter is a character, or a sequence of characters, that has a special meaning and provides information about other characters. There are various metacharacters that you can use in your regular expressions, and the list here is a very small one. There are dedicated books written on regular expressions, but in today's video we'll just cover the basics, so that when we look at the regular expressions built into various Splunk add-ons we do not get confused.

So let's do one thing: I'll copy this sentence and paste it into this website, which is regex101. You paste your string here, write your regular expression, and it shows you the matching information.
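To make the difference between literal characters and metacharacters concrete outside of regex101, here is a small Python sketch. The sentence is an assumed reconstruction of the one used in the demo (its capitalization in particular is a guess), and the dot is used simply as one example of a metacharacter.

```python
import re

# Assumed reconstruction of the demo sentence; the capitalization is a guess.
sentence = "There is a Rainbow which arises on South Shore of Mumbai"

# Literal characters: the pattern matches exactly the text "Rainbow".
literal = re.search(r"Rainbow", sentence)
print(literal.group(), literal.span())

# Metacharacter: "." stands for any single character,
# so "R.inbow" matches "Rainbow" as well.
meta = re.search(r"R.inbow", sentence)
print(meta.group())
```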

On the right-hand side it has a help reference where you can browse meta sequences, general tokens, or all tokens. If you want to see which regex tokens you can use, this is a nice reference. As we already discussed, if you just type rainbow here, it matches the literal characters of rainbow. That is quite easy and every one of us already knows it. So the first things we'll look into are backslash d (\d) and backslash w (\w). \w stands for any word character: capital A to Z, small a to z, zero to nine, or an underscore. If I type one \w, you will see it captures each individual character. Let me add one more, and now it captures two characters at a time. If you want to capture the first word here, which contains five characters, you can type \w five times. Keep in mind this matches any run of five word characters, not only whole five-letter words. On the right-hand side you can see it matching There, Rainb, which, arise, South, Shore and Mumba. If I add one more \w, you see it captures the entire word Mumbai.

So this is something you can use. Now, instead of typing \w over and over, you can write \w followed by a count in curly brackets. If you put five within the brackets, \w{5}, the meaning is the same as before. If I put six, \w{6}, you see the entire six-letter word is captured. So you don't really have to write super long regular expressions. Let me remove this. Now, you can use something like A-Z within square brackets, [A-Z]. This matches any single character between capital A and capital Z. You see the capital T has been highlighted, capital R, capital M and so on. If I replace it with small a-z, [a-z], it starts to capture the lowercase characters and does not capture any capital character. You can also negate the set with a caret, [^a-z]. Now it will not capture anything that is in lowercase a to z; it captures everything else, including capitals and spaces. You can also change it to capital A-Z, [^A-Z], and within the full match you will get everything except the capital letters A to Z.
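The same experiments can be repeated in Python with re.findall, which returns every match it finds. This is only a sketch: the sentence is again an assumed reconstruction of the demo text, and the variable names are made up for the example.

```python
import re

# Assumed reconstruction of the demo sentence; the capitalization is a guess.
sentence = "There is a Rainbow which arises on South Shore of Mumbai"

# \w{5} is shorthand for \w\w\w\w\w: any run of five word characters.
five = re.findall(r"\w{5}", sentence)
print(five)

# \w{6} only fits where six word characters appear in a row.
six = re.findall(r"\w{6}", sentence)
print(six)

# [A-Z] matches a single uppercase letter.
capitals = re.findall(r"[A-Z]", sentence)
print(capitals)

# [^a-z] is the negated set: anything that is NOT a lowercase letter,
# which includes the capitals and the spaces.
not_lower = re.findall(r"[^a-z]", sentence)
print(not_lower)
```

Notice that \w{5} returns Rainb and Mumba, partial words, which is the behaviour seen in the video.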

So this is something you can utilize. These are the very basics of regular expressions, and I hope you have started to understand what they are. Now to our first example. This example contains sample data with the use case of matching phone numbers. I have two phone numbers here, and my aim is to write a regular expression that can match both of them. Let me copy them, remove the previous regular expression, and paste the phone numbers here. If you look at the reference list, for any word character you can use \w, and for any digit you can use \d. If I type one \d, you see it starts to capture individual digits. Let me add one more, and one more again. All right, now each match is a group of three digits, such as 102, 545 and 955. So there are six matches in total. Now let's match the first number here, the one that contains hyphens.

After the three digits I'll add a hyphen and three more \d, then one more hyphen and the remaining \d tokens. This gives a full match on the first data sample. But since the second data sample does not have any hyphens, it will not match, and that is not what we want. For this you have a special metacharacter, the dot.

A dot signifies any character, which is a very important point to remember. So instead of the first hyphen you can put a dot, and instead of the second hyphen you can put a dot as well. Now you see both samples match, because the dot signifies any character. Let's add one more sample, one that uses asterisks as separators. You see it also gets matched. It might well be that this is not a phone number, but since you have used a dot, the pattern matches a hyphen, a literal dot character, and an asterisk as well. Say this third sample is not a phone number and you just want to match the first and the second one. The requirement is that only numbers separated by a hyphen or a dot should be counted; you do not want this third sample to be counted. If you look at the reference, there are square brackets, described as matching any of the characters in the brackets.

So after your three digits you can add square brackets containing a hyphen and a dot, [-.], and let me remove the bare dot that was there. What this does is that after three digits it looks for either a hyphen or a dot. And here you see it only captures the first and the second sample; it does not capture the third one. So this is the regular expression you can use. Just to tidy it up further, you can make use of the curly brackets, which make things much easier to read. Let's quickly do that: I'll replace the first run of \d with \d{3} and the second with \d{3} again. Now the expression is much shorter and easier to read than writing backslash d multiple times. So this is a much more compact regular expression.
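Here is a short Python sketch of the phone-number example. The three sample numbers are made up for illustration, since the exact numbers used in the video are not recoverable from the narration, and the four-digit final group is likewise an assumption.

```python
import re

# Made-up sample data: hyphen-separated, dot-separated, asterisk-separated.
samples = ["102-545-9551", "102.545.9551", "100*215*2100"]

# "." accepts ANY character as the separator, so all three samples match.
any_sep = re.compile(r"\d{3}.\d{3}.\d{4}")

# "[-.]" accepts only a hyphen or a literal dot as the separator.
# Inside square brackets the dot loses its special meaning.
strict_sep = re.compile(r"\d{3}[-.]\d{3}[-.]\d{4}")

loose_matches = [s for s in samples if any_sep.fullmatch(s)]
strict_matches = [s for s in samples if strict_sep.fullmatch(s)]

print(loose_matches)
print(strict_matches)
```

The stricter pattern is usually the safer one for field extraction, because it refuses separators you never intended to accept.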

In example two, we have another sample data set, and we need to write a regular expression that can match the entire data set. I'll copy the whole data set and paste it into our regular expression editor. Let's go ahead and write a regular expression. The very first thing you can do is type Mr, and you will see it matches in the first four samples. However, the challenge here is that in the first sample you have a dot, in the second you do not, in the third you have Mrs, in the fourth you have Mr followed by the letter L, and in the fifth you have Ms. Let's start by matching the first sample. We have already discussed that a dot means any character.

So if I put a dot after Mr, it matches in several samples, because the dot also accepts a space, and it also accepts the s in Mrs. If you want to match only the literal dot, you need an escape sequence: put a backslash before the dot, \., and it will only match an actual dot character. Now this only matches the first form. What you want, though, is to match not only this but also the data that does not have the dot. So you can put a question mark after it, \.?, and it will start to capture the data without a dot too. This means: match Mr followed by a dot, or Mr with no dot. All right. Now, after Mr you have a space.

For a space, you have backslash s (\s), which stands for whitespace. So we'll add a \s, and now you see the space has been accounted for. After the space I have Zeel, I have Harsh, and so on, and these can be treated as words. If I add a \w, it captures the first character of each name, which is Z, H and L. But you want the entire word, so you can make use of plus. When you add a plus, \w+, you see it captures all the word characters that follow, and Mr. L is covered as well. You could also use an asterisk instead; depending on the use case you can use one or the other. We'll stay with \w+, and some of the samples are now highlighted. But as you can see, Mrs. Surreyka and Ms. Alice are not being accounted for. What we want is a regular expression that also covers the Mrs and Ms titles.

What you can do is create a group which says: anything that starts with M, and after the M it can have r, it can have s, or it can have rs, written as M(r|s|rs). Once you do this, you see Mr. Zeel, Mr. Harsh, Mrs. Surreyka, Mr. L and Ms. Alice are all being accounted for.
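The steps of this second example can be sketched in Python as follows. The sample strings are hypothetical stand-ins for the data set shown in the video (the spelling of the names and where the dots appear are assumptions), and the two pattern names are made up for the example.

```python
import re

# Hypothetical stand-ins for the data set shown in the video.
samples = ["Mr. Zeel", "Mr Harsh", "Mrs Surreyka", "Mr. L", "Ms. Alice"]

# Step 1: "Mr", an optional literal dot, one whitespace character, then a word.
mr_only = re.compile(r"Mr\.?\s\w+")

# Step 2: the group (r|s|rs) is an OR condition on what may follow the "M".
any_title = re.compile(r"M(r|s|rs)\.?\s\w+")

matched_mr = [s for s in samples if mr_only.fullmatch(s)]
matched_all = [s for s in samples if any_title.fullmatch(s)]

print(matched_mr)
print(matched_all)
```

The first pattern misses the Mrs and Ms entries; the grouped pattern covers the whole list, which is the result the video arrives at.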

This is basically an OR condition that you are putting in. So these are some basics of what regular expressions are. There are entire books dedicated to regular expressions, and we do not want to do an entire course on them right now, because this course is not dedicated to that. But we covered the basics so that when we see a regular expression, we do not get confused about what it is doing. That is why I decided to have this basic video.
