SPLK-2002 Splunk Enterprise Certified Architect – Distributed Splunk Architecture Part 2

  1. Implementing License Master

Hey everyone, and welcome back. In today’s video, we will be discussing Splunk license pools. Whenever you install a Splunk license, that license resides in a license stack; for an Enterprise license, this is referred to as the Splunk Enterprise stack. We will look at it right after this slide. The stack has a default license pool called auto_generated_pool_enterprise, and any slave that connects to this License Master has access to that default pool.

Let’s understand this by looking at the licensing page. If you go to the License Master, you see the Splunk Enterprise stack. The license listed here reads "Splunk Developer Personal License DO NOT DISTRIBUTE", but that is just the name of the license. The point is that this is a stack: if you have a Splunk Enterprise license, it shows up as a Splunk Enterprise stack. And this stack has a pool. You can see the auto-generated Enterprise pool over here, and that pool has a volume capacity of 10 GB. Any server that connects to my License Master has access to this pool. That means if I connect a license slave here, that slave can use up to 10 GB of volume from this pool. That is what the first three points on the slide are all about.

Now, there is a problem with the default approach. Let’s say you have a License Master with 10 GB of license, and three slaves connected to it, each an individual Splunk instance. One is used by the development team, the second by the SRE team, and the third by the security team. When you connect all of these Splunk instances to the License Master, all of them have access to the auto-generated pool, which by default covers the entire 10 GB of license volume.

Now, the problem with this scenario is that if the development team uses 8 GB of data per day, that leaves only 2 GB to be shared between the SRE team and the security team, perhaps 1 GB each. And if the SRE team uses the entire remaining 2 GB, the security team has no volume left to index. That is the problem with the auto-generated pool. So what we do is create pools for specific uses. I create a pool for the development team where they can use a maximum of 3 GB, a pool for the SRE team with a maximum of 3 GB again, and a pool for the security team with a maximum of 4 GB. That way, the licensing volume is isolated per team. Let’s look at how we can do that. If you scroll down a bit to the auto-generated pool, you can edit it and specify the amount that it is allowed to index. I’ll just click Cancel for the time being.
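To make the arithmetic concrete, here is a minimal Python sketch of the two setups. It is a toy model, not Splunk code: the function names and the demand figures for the SRE and security teams are assumptions made for illustration, and each quota is treated as a hard daily cap purely to keep the example simple. How Splunk actually reacts to an exceeded quota is not modeled.

```python
def shared_pool(quota_gb, demand_gb):
    """Serve teams in order from one shared quota; return GB granted per team."""
    remaining = quota_gb
    granted = {}
    for team, wanted in demand_gb.items():
        granted[team] = min(wanted, remaining)
        remaining -= granted[team]
    return granted


def dedicated_pools(pool_gb, demand_gb):
    """Each team draws only from its own pool; return GB granted per team."""
    return {team: min(wanted, pool_gb[team]) for team, wanted in demand_gb.items()}


# Hypothetical daily demand in GB (dev's 8 GB comes from the scenario above).
demand = {"dev": 8, "sre": 2, "security": 4}

# One 10 GB default pool: dev and SRE use it up, security gets nothing.
shared = shared_pool(10, demand)

# Three pools of 3 + 3 + 4 GB: dev is capped, security keeps its share.
isolated = dedicated_pools({"dev": 3, "sre": 3, "security": 4}, demand)

print(shared)    # {'dev': 8, 'sre': 2, 'security': 0}
print(isolated)  # {'dev': 3, 'sre': 2, 'security': 4}
```

With the shared pool, whoever indexes first can starve the others; with dedicated pools, the development team is held to its 3 GB and the security team still has its 4 GB.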

Let’s create a new pool. I’ll name it Dev pool, and in the description I’ll say this is for the development team. Then comes the license maximum. I’ll choose a specific amount, since the development team only needs access to 3 GB, so I’ll enter 3 and set the unit to GB. Next, which indexers are eligible to draw from this pool? You can specify the indexers over here. In our case, if you remember, we had created two license slaves. If I do a docker ps, the one we are interested in is Splunk slave one. So let’s quickly log in: I open a bash shell on Splunk slave one and note the hostname it reports. Back in the form, I’ll say that this specific indexer can access the license pool called Dev, with a specific amount of 3 GB, and I click Submit. Now it says: "Bad Request: failed to create pool, stack is fully allocated". The reason you are getting this error is that you already have a default pool.

Let’s look here. You already have the default auto-generated Enterprise pool, which is currently taking 100% of the license. So you can either edit that pool or delete it altogether. Let’s look at how to edit it. If I click Edit, I’ll change the license maximum to a specific amount and enter 1 GB, then click Submit. Only once you have done this can you create another pool. Now you can fill in the Dev pool again, associate the indexer, and click Submit. This time you see that the license pool Dev has been created successfully, so I’ll click OK.
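The "stack is fully allocated" behaviour can be sketched as a simple bookkeeping rule: the sum of all pool sizes may not exceed the stack size. The class below is a hypothetical Python model written for this explanation, not the License Master's implementation; `LicenseStack`, `set_pool`, and `StackFullyAllocated` are made-up names.

```python
class StackFullyAllocated(Exception):
    """Raised when a pool would push total allocation past the stack size."""


class LicenseStack:
    def __init__(self, quota_gb):
        self.quota_gb = quota_gb
        self.pools = {}  # pool name -> allocated GB

    def allocated(self):
        return sum(self.pools.values())

    def set_pool(self, name, size_gb):
        """Create a pool, or resize it if it already exists."""
        others = self.allocated() - self.pools.get(name, 0)
        if others + size_gb > self.quota_gb:
            raise StackFullyAllocated(f"cannot allocate {size_gb} GB to {name!r}")
        self.pools[name] = size_gb


stack = LicenseStack(quota_gb=10)
# The default pool starts out holding 100% of the stack.
stack.set_pool("auto_generated_pool_enterprise", 10)

try:
    stack.set_pool("dev", 3)  # fails: nothing is left to allocate
    created_first_try = True
except StackFullyAllocated:
    created_first_try = False

# Shrink the default pool to 1 GB, then the Dev pool fits.
stack.set_pool("auto_generated_pool_enterprise", 1)
stack.set_pool("dev", 3)
print(created_first_try, stack.pools)
```

After shrinking the default pool, 4 GB of the 10 GB stack is allocated, which leaves room for the remaining pools.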

One more important aspect I wanted to show you: let’s try to add one more pool. This time we name the pool SRE, and for the allocated amount I’ll say 3 GB here. If you look at the specific indexers, one indexer is grayed out. So one important point to remember is that, at any given time, an indexer can be associated with only one pool. It cannot belong to more than one pool.
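The grayed-out indexer follows from a one-to-one rule between an indexer and its pool. A small Python sketch of that rule, under the assumption that membership is just a lookup table; the class and the hostnames `splunk-slave-1` and `splunk-slave-2` are hypothetical:

```python
class PoolMembership:
    """Tracks which pool each indexer belongs to (at most one)."""

    def __init__(self):
        self.pool_of = {}  # indexer name -> pool name

    def available(self, indexers):
        """Indexers that can still be picked for a new pool."""
        return [name for name in indexers if name not in self.pool_of]

    def assign(self, indexer, pool):
        current = self.pool_of.get(indexer)
        if current is not None and current != pool:
            raise ValueError(f"{indexer} already belongs to pool {current!r}")
        self.pool_of[indexer] = pool


indexers = ["splunk-slave-1", "splunk-slave-2"]  # hypothetical hostnames
members = PoolMembership()
members.assign("splunk-slave-1", "dev")

# Only the second indexer is still selectable for the SRE pool.
selectable = members.available(indexers)

try:
    members.assign("splunk-slave-1", "sre")
    rejected = False
except ValueError:
    rejected = True

print(selectable, rejected)
```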

  1. License Pools

Hey everyone, and welcome back. In today’s video, we will be discussing one of the most important components of Splunk, which is the indexer. The indexer is the Splunk Enterprise component whose responsibility is to index data: it transforms the data into searchable events and then places the results into the appropriate index. Because this component is so important, Splunk provides out-of-the-box clustering capabilities to ensure high availability, and most organizations running Splunk opt for them.

To understand the indexer component, let’s take this diagram. You can have multiple inputs: a monitor input, FIFO, UDP, TCP, scripted inputs, and so on. From these inputs, the data goes into the parsing queue. From the queue, it goes to the parsing pipeline, where things like source and event tagging, character normalization, and regex transformations happen. Once everything related to parsing is done, the data goes to the index queue. From the index queue, it goes to the indexing pipeline, where the index building happens. Once this is completed, the raw data as well as the index files get stored on disk.

So there are two stages here: the parsing stage and the indexing stage. The parsing queue and parsing pipeline form the parsing stage; the index queue and indexing pipeline form the indexing stage. Let’s understand the importance of both, starting with some of the actions that happen during the parsing stage. One of the first is the extraction of the default fields for each event, which include host, source, and sourcetype.
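The flow through the two stages can be sketched in a few lines of Python. This is a deliberately simplified model of the diagram, not how Splunk is implemented: the function names, the sample log line, and the values `web01`, `/var/log/app.log`, and `app_log` are all assumptions chosen for illustration.

```python
from collections import deque


def parsing_pipeline(raw, host, source, sourcetype):
    """Parsing stage: split a raw chunk into events and attach default fields."""
    return [
        {"_raw": line, "host": host, "source": source, "sourcetype": sourcetype}
        for line in raw.splitlines()
        if line.strip()
    ]


def indexing_pipeline(event, index):
    """Indexing stage: store the event and record which events hold each term."""
    event_id = len(index["events"])
    index["events"].append(event)
    for term in set(event["_raw"].lower().split()):
        index["terms"].setdefault(term, []).append(event_id)


parsing_queue = deque()
index_queue = deque()
index = {"events": [], "terms": {}}  # stands in for raw data + index files on disk

# An input hands a raw chunk to the parsing queue.
parsing_queue.append(
    ("login ok user=alice\nlogin failed user=bob\n", "web01", "/var/log/app.log", "app_log")
)

while parsing_queue:  # parsing queue -> parsing pipeline -> index queue
    raw, host, source, sourcetype = parsing_queue.popleft()
    index_queue.extend(parsing_pipeline(raw, host, source, sourcetype))

while index_queue:  # index queue -> indexing pipeline -> "disk"
    indexing_pipeline(index_queue.popleft(), index)

print(index["terms"]["login"])        # [0, 1]
print(index["events"][1]["host"])     # web01
```

Every event leaves the parsing stage carrying host, source, and sourcetype, and the indexing stage only ever sees events that have already been parsed.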

Let’s understand this before we proceed further. This is my Splunk Enterprise instance, and whenever you open any event, you will see that there are always three fields: host, source, and sourcetype. These are referred to as the default fields. Then you have the event fields here, so you have both the default fields and the event fields. The extraction of host, source, and sourcetype happens at the parsing stage. Along with that, if you go to Settings and then Data inputs (let me quickly bring the diagram back up), you see there are multiple inputs from which data can come into the parsing queue, and these are the same inputs we see over here: TCP, UDP, scripts, HEC, and also monitor, where you can monitor files and directories.

Once this extraction happens, there are various other capabilities that are part of the parsing stage. One is configuring the character set encoding. Another is identifying line termination using line-breaking rules, which you can define yourself or leave to Splunk. Next, one of the important things a lot of organizations use is masking sensitive details in the data, such as credit card numbers. These are some of the functions of the parsing stage.

The second is the indexing stage. During the indexing pipeline, Splunk performs tasks such as breaking events into segments that can later be searched on, building the index data structures that make searches fast, and writing the raw data and index files to disk. These are some of the aspects of the indexing stage. So these are the two important stages, and both are part of the indexer component, which takes care of all of this.
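As a rough illustration of two of these steps, the Python sketch below masks card-like numbers the way a parsing-time rule might, and then splits an event into searchable segments. The regular expressions and function names are assumptions for this example; they are not Splunk's own masking configuration or segmentation rules, which are more involved.

```python
import re

# Matches 16 digits in groups of four, optionally separated by "-" or space.
CARD_PATTERN = re.compile(r"\b(?:\d{4}[- ]?){3}(\d{4})\b")


def mask_cards(text):
    """Parsing-stage idea: hide all but the last four digits of a card number."""
    return CARD_PATTERN.sub(lambda m: "XXXX-XXXX-XXXX-" + m.group(1), text)


def segments(event):
    """Indexing-stage idea: break an event into lowercase searchable tokens."""
    return re.findall(r"\w+", event.lower())


masked = mask_cards("payment card=4111-1111-1111-1111 amount=20")
tokens = segments("login failed user=bob")

print(masked)  # payment card=XXXX-XXXX-XXXX-1111 amount=20
print(tokens)  # ['login', 'failed', 'user', 'bob']
```

Because masking happens before the event is written, the full card number never reaches the stored raw data or the index.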

  1. Indexer

