DFIR - Automating End to End Acquisition and Processing

Partly out of curiosity and partly out of necessity, I was looking around for resources on automating functions across DFIR acquisition and processing pipelines. I stumbled across two interesting resources detailing automation across different parts of the process. The first was a great post by Bryant Cabantac, Deploying Velocriptor in AWS using CloudFormation — Scaling forensic acquisition, which detailed a basic deployment of Velociraptor using CloudFormation on AWS. The second was a fantastic SANS Summit talk by Eric Capuano and Whitney Champion, who built out an acquisition pipeline using Velociraptor with processing performed by Plaso and timesketch; the talk and associated artifacts can be found on Recon InfoSec’s GitHub here.

Both are great endeavors in their own right, and I wanted to extend them further by bringing the two together: an automated deployment of Velociraptor, timesketch and the associated supporting infrastructure.

The goal here was to create a fully ephemeral environment that could be deployed at a moment’s notice and deleted once no longer required. The use case is tailored towards those performing DFIR across multiple organisations who only need the environment for the life of an incident response engagement.

The plan was to extend and build out functionality in a few places including:

  • Deployment of Velociraptor and timesketch via CloudFormation.

  • Deployment of supporting configuration via CloudFormation.

  • Adding in additional deployment features including subdomain registration and TLS configuration using Let's Encrypt.

  • Adding in support for EBS volumes allowing for deployment into larger environments.

  • Adding in automatic AMI selection and other quality of life auto configuration.

Throughout this post I’ll cover how I went about building this environment, including deployment of AWS components via CloudFormation, post-CloudFormation deployment activities and collection via Velociraptor.

Keep in mind this is all set up in a lab environment and not production ready. Careful consideration should be given when deploying in this manner for real world use cases. CloudFormation can also be extended to include additional AWS components, including VPC, subnet, S3 bucket and WAF setup and more.

Additionally, Velociraptor and timesketch are both incredibly powerful tools in their own right and can be used in standalone scenarios. I’d highly recommend checking them out.

TL;DR

Automated the deployment of Velociraptor and timesketch using AWS CloudFormation. Deployed functionality to perform acquisition and automatic processing of artifacts with ingestion into timesketch.

All templates available on GitHub.

Overview

Keeping in mind the goal of automating end to end artifact acquisition and processing, let’s jump into my approach with a brief overview of the components in play, how I’ll break these up into manageable segments for automation and a depiction of the end to end process.

Velociraptor will be the primary artifact acquisition method; for an overview of Velociraptor and how to perform a basic setup, check out my previous post on a basic Velociraptor deployment using AWS. The difference here is that we will be using CloudFormation for the deployment. Velociraptor clients will sit on endpoints for interaction and collection, while the server will be deployed on an EC2 instance with appropriate security groups, networking, storage and access set up.

Plaso/log2timeline is a Python-based engine used for the creation of various timelines. log2timeline is a command line tool to extract events from individual files, by recursing a directory (for example a mount point) or from a storage media image or device. log2timeline creates a Plaso storage file which can be analyzed with the pinfo and psort tools. You can find additional info on Plaso here.

Log2timeline sits in the middle of this equation, taking artifacts acquired from endpoints using Velociraptor and parsing them for ingestion into timesketch. There are some behind the scenes pieces here, including moving artifacts, storing artifacts pre and post parsing and serving them to timesketch, which we will go into in further detail later in this post.

Timesketch is an open-source tool for collaborative forensic timeline analysis and our final piece of the puzzle. We will use timesketch to do the heavy lifting: processing our Plaso files, ingesting them and presenting them to analysts for consumption.

The below is a simplified overview visualising the flow. The diagram omits finer details such as S3 buckets, code repositories and other underlying AWS constructs, as we will dive into these throughout this post.

Design

The lab design has been kept simple as this is a demonstration of the concept and not designed for production deployment. Making up the environment there are several static components not deployed via CloudFormation. This was purely a design choice and could be incorporated into the stack templates or their own template if desired.

If you are looking to recreate this setup and deploy into your own AWS account you will need to create the following at a minimum prior to deploying the CloudFormation stacks (a sketch of how these could be rolled into a template of their own follows the list):

  • Singular VPC where both timesketch and Velociraptor will be deployed.

  • Subnet - Associated with the VPC.

  • Internet gateway - Associated with the VPC.
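If you would rather have the stack own these prerequisites as well, a minimal sketch of what that could look like is below. This is an assumption on my part rather than part of the published templates, the resource names are my own, and a route table with a default route to the internet gateway would also be needed for external connectivity.

resourceLabVPC:
    Type: AWS::EC2::VPC
    Properties:
        CidrBlock: 10.0.0.0/16 # example range only, adjust to suit

resourceLabSubnet:
    Type: AWS::EC2::Subnet
    Properties:
        VpcId: !Ref resourceLabVPC
        CidrBlock: 10.0.0.0/24
        MapPublicIpOnLaunch: true

resourceLabInternetGateway:
    Type: AWS::EC2::InternetGateway

resourceLabGatewayAttachment:
    Type: AWS::EC2::VPCGatewayAttachment
    Properties:
        VpcId: !Ref resourceLabVPC
        InternetGatewayId: !Ref resourceLabInternetGateway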

The CloudFormation stacks will deploy the rest of the components including the following:

  • Two EC2 instances running timesketch and Velociraptor respectively.

  • IAM user and policies for S3 Bucket Access.

  • Security Groups controlling EC2 instance access.

Subsequent sections will explore components of the EC2 instances and CloudFormation used to create them.

The CloudFormation written to deploy both applications also deploys or modifies a number of auxiliary components that contribute to the operation of both applications. The following is a brief summary of the configured components:

  • KeyPair - Used for SSH onto the EC2 Instance.

  • EBS Volume - Extends the capacity of the EC2 instance, allowing it to cater for larger deployments.

  • ElasticIP - Public IP for external communication (automatically assigned).

  • DNS Record (Route53) - Subdomain A record within an existing hosted zone. Supports Velociraptor client deployment and Let’s Encrypt registration.

  • IAM Policy - Controls S3 bucket permissions.

  • VPC - Places the EC2 instance within the VPC.

  • Security Group - Controls access to/from EC2 instance.

Velociraptor

The following will explore the CloudFormation template used to deploy Velociraptor. I’ve broken the template into sections with explanations of each. Check out my GitHub page for the full template.

Parameters

The following parameters were used within the Velociraptor CloudFormation template. These are fairly straightforward and will feed into the various resources when creating the EC2 instance and associated components. Think of these as variables; we can modify them at run time with user input, handy if you want to change elements of the build such as EC2 size, EBS volume size or keypair.

The “parameterLatestAmiId” parameter will select the latest Amazon Linux AMI, removing the need to update the CloudFormation template when new versions are released.

parameterSSHKeyName:
    Description: Amazon EC2 Key Pair for Velociraptor instance
    Type: "AWS::EC2::KeyPair::KeyName"

parameterVPC:
    Description: VPC ID to deploy Velociraptor instance to
    Type: "AWS::EC2::VPC::Id"

parameterSubnetID:
    Description: Subnet ID to deploy Velociraptor instance to
    Type: "AWS::EC2::Subnet::Id"

parameterInstanceType:
    Description: Velociraptor EC2 instance types
    Type: String
    Default: t2.medium
    # Additional values - These can be changed - Recommend removing micro, nano for real world deployments
    AllowedValues:
        - t2.micro
        - t2.nano
        - t2.small
        - t2.medium

parameterEBSSize:
    Description: Velociraptor EC2 EBS Volume Size
    Type: Number
    Default: 60 # Change depending on size of environment
    
parameterLatestAmiId:
    Type : "AWS::SSM::Parameter::Value<AWS::EC2::Image::Id>"
    Default: /aws/service/ami-amazon-linux-latest/al2023-ami-kernel-6.1-x86_64


Resources

The Resources section declares the AWS resources that you want to include in the stack, such as an Amazon EC2 instance or an Amazon S3 bucket. Resources deployed as part of the Velociraptor CloudFormation stack are explored in detail below.

IAM Policy

The “resourceVRS3BucketsPolicy” IAM policy allows the EC2 instances to access the desired S3 bucket. In this case the Velociraptor instance is allowed to list and get objects from the bucket. The purpose of this is to allow for retrieval of Velociraptor server configuration files for server deployment.

There are also additional IAM roles, policies, a user and an associated access key created at this stage. These will be used later on by our Velociraptor instance to interact with the S3 bucket, allowing Velociraptor to upload KAPE collections as zip files to S3 for download, parsing and ingest into timesketch. The CloudFormation code below also saves the key and secret in Secrets Manager for later access if required.

When replicating this configuration, careful consideration should be put towards the “Resource” field as this is where the target bucket is determined. This should be restricted to your specific Velociraptor bucket or folder; the below uses * as a wildcard match for demonstration purposes, and a sketch of a restricted statement pair follows the template code.

resourceVRIAMUser:
    Type: AWS::IAM::User
    Properties:
        Path: / #optional path value, can use this to reference the creds in bulk
        UserName: !Ref AWS::StackName

resourceVRIAMUserCredentials:
    Type: AWS::IAM::AccessKey
    Properties:
        Status: Active
        UserName: !Ref resourceVRIAMUser

resourceVRCredentialsStored:
    Type: AWS::SecretsManager::Secret
    Properties:
        Name: !Sub /velociraptor/credentials/${resourceVRIAMUser}
        SecretString: !Sub '{"ACCESS_KEY":"${resourceVRIAMUserCredentials}","SECRET_KEY":"${resourceVRIAMUserCredentials.SecretAccessKey}"}'

resourceVRIAMUserRole:
    Type: "AWS::IAM::Role"
    Properties:
        RoleName: 'resourceVRIAMUserRole'
        Path: "/"
        
        AssumeRolePolicyDocument:
            Version: '2012-10-17'
            Statement:
            - Effect: Allow
              Principal: 
                AWS: !Sub 'arn:aws:iam::${AWS::AccountId}:user/${AWS::StackName}' # AWS::AccountId resolves to the deploying account
              Action: sts:AssumeRole 
    DependsOn:
       - resourceVRIAMUser 
       
resourceVRIAMUserRolePolicy:
    Type: "AWS::IAM::ManagedPolicy"
    Properties:
        ManagedPolicyName: 'resourceVRIAMUserRolePolicy'
        Roles:
            - Ref: "resourceVRIAMUserRole"
        PolicyDocument:
            Version: "2012-10-17"
            Statement:       
            - Effect: Allow
              Action:
              - s3:Get*
              - s3:Put*
              - s3:List*       
              Resource:
              - "*"
    DependsOn:
      - resourceVRIAMUser

resourceVRS3BucketsRole:
    Type: AWS::IAM::Role
    Properties:
        AssumeRolePolicyDocument:
            Version: '2012-10-17'
            Statement:
            - Effect: Allow
              Principal: 
                Service:
                - ec2.amazonaws.com
              Action:
                - sts:AssumeRole
        Path: "/"

resourceVRS3BucketsInstanceProfile:
    Type: AWS::IAM::InstanceProfile
    Properties:
        Path: "/"
        Roles:
        - Ref: resourceVRS3BucketsRole

resourceVRS3BucketsPolicy:
    Type: "AWS::IAM::Policy"
    Properties:
      PolicyName: resourceVRS3BucketsPolicy
      PolicyDocument:
        Version: "2012-10-17"
        Statement:
          -
            Action: s3:GetObject
            Effect: Allow
            Resource: "*" #Important! Restrict to your bucket here.

          -
            Action: s3:ListBucket
            Effect: Allow
            Resource: "*" #Important! Restrict to your bucket here.
      Roles:
            - !Ref resourceVRS3BucketsRole
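Coming back to the “Resource” warning above, a hedged example of what a tightened statement pair could look like, assuming the s3-velociraptor-prod bucket name used later in this post. Note that s3:GetObject applies to object ARNs while s3:ListBucket applies to the bucket ARN itself:

          -
            Action: s3:GetObject
            Effect: Allow
            Resource: arn:aws:s3:::s3-velociraptor-prod/*

          -
            Action: s3:ListBucket
            Effect: Allow
            Resource: arn:aws:s3:::s3-velociraptor-prod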

Security Group

Security groups have been set up with open permissions for lab and testing purposes; it is advised these are tuned accordingly. One interesting configuration option could be using a parameter for analyst input at stack creation, for example the public IP of the destination client deployment environment, and restricting inbound access to only that address (a sketch of this follows below).

A few things to keep in mind: Let’s Encrypt requires inbound connectivity on port 80 for certificate registration, and client and GUI access need to be accounted for here. SSH can and should be restricted to internal access only.

It is advisable that, at a minimum, external access is restricted to a specific public IP, or preferably only allowed via private IP (in my case to AWS Client VPN users), with exceptions added for Let’s Encrypt and GitHub interaction.

Alternatively AWS Session Manager could be deployed for access.
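As a hedged sketch of the parameter idea mentioned above (the parameter name is my own invention), an analyst-supplied CIDR could be declared once and referenced in each ingress rule:

parameterAllowedCidr:
    Description: Public CIDR range permitted to reach the instance
    Type: String
    Default: 0.0.0.0/0 # override at stack creation, e.g. 203.0.113.10/32
    AllowedPattern: '^(\d{1,3}\.){3}\d{1,3}/\d{1,2}$'

Each ingress entry below would then use CidrIp: !Ref parameterAllowedCidr in place of the hardcoded 0.0.0.0/0.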

resourcePublicAccessSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
        VpcId: !Ref parameterVPC
        GroupDescription: Allows SSH, HTTP, and HTTPS access to Velociraptor instance
        GroupName: !Sub '${AWS::StackName}-velociraptor-ssh-access'
        SecurityGroupIngress:
            -
                CidrIp: '0.0.0.0/0'
                Description: 'Replace this rule with a stricter one' # Change and access via Client VPN - left open for lab purposes
                FromPort: 22
                ToPort: 22
                IpProtocol: tcp        
            -
                CidrIp: '0.0.0.0/0'
                Description: 'HTTPS web GUI' # recommend lock down to specific IP list
                FromPort: 443
                ToPort: 443
                IpProtocol: tcp    
            -
                CidrIp: '0.0.0.0/0'
                Description: 'Agent Front End' # Restrict to public address of agent deployment environment
                FromPort: 80
                ToPort: 80
                IpProtocol: tcp                        
        Tags:
            -
                Key: Name
                Value: !Sub '${AWS::StackName}-velociraptor-ssh-access'

Route53

DNS configuration is required for client connectivity when deploying Velociraptor. In addition to this I also wanted to leverage Let’s Encrypt for TLS, which requires a domain name. Manual configuration is possible, however I decided to work this into the CloudFormation template.

The code will automatically register the designated subdomain within the hosted zone of your choosing. The IP for the A record is obtained directly from the EC2 instance created, as this Elastic IP is dynamically assigned.

An alternate design giving more flexibility could be to use parameters allowing users to nominate the hosted zone and subdomain at stack runtime. One important thing to keep in mind is the Velociraptor configuration file: when generating the config you set a domain name, and the domain name nominated here must match it, so leaving this as a variable parameter could cause conflicts if the nominated subdomains don’t match. A sketch of the parameterised approach follows the template below.

To solve this you could use Velociraptor’s options for automatic server configuration; I’ll save that for an upcoming blog post.

resourceSubdomainDNSRecord:
    Type: 'AWS::Route53::RecordSet'
    Properties:
        HostedZoneId: <HostedZoneID - Here> # Hosted Zone ID of your intended domain
        Name: "<subdomain here>" #Uses a set subdomain (matches velociraptor generated server config file)
        ResourceRecords:
            - !GetAtt 
                - resourceVelociraptor
                - PublicIp
        TTL: '300'
        Type: "A"

EC2 Instance

The configuration below deploys our Velociraptor EC2 instance, calling on some of the parameters defined earlier including AMI selection, InstanceType, VPC and subnet. An EBS volume is also created and attached; sizing of this volume defaults to 60GB but is changeable through parameters during stack creation.

The most notable section in the configuration below is the UserData property; this is data we can pass to the EC2 instance for execution on startup. The commands perform the following actions:

  • Initialise the EBS volume and mount to /data.

  • Copy the prerequisite Velociraptor configuration files from our S3 bucket.

  • Download the latest Velociraptor release from GitHub.

  • Adjust permissions for execution.

  • Copy service configuration files from our S3 bucket.

  • Enable the Velociraptor service.

Another way of doing this could be to create post installation scripts stored in a code repository or S3 bucket, called and executed during startup; a sketch of this is included after the template below.

resourceVelociraptor:
    Type: AWS::EC2::Instance
    Properties:
        # AMI selected via the SSM parameter (latest Amazon Linux 2023)
        IamInstanceProfile: !Ref resourceVRS3BucketsInstanceProfile
        ImageId: !Ref parameterLatestAmiId #default Linux AMI - update as necessary
        KeyName: !Sub '${parameterSSHKeyName}'
        InstanceType: !Sub '${parameterInstanceType}'
        UserData:
            Fn::Base64:
                Fn::Join: [ "", [
                    "#!/bin/bash -xe",
                    "\n",
                    "sudo sleep 60",
                    "\n",
                    "sudo mkfs -t xfs /dev/xvdf",
                    "\n",
                    "sudo mkdir /data",
                    "\n",
                    "sudo mount /dev/xvdf /data",
                    "\n",
                    "sudo mkdir /data/velociraptor",
                    "\n",
                    "sudo aws s3 cp s3://s3-velociraptor-prod/vr-server-config/server.config.yaml /data/velociraptor/",
                    "\n",
                    "sudo wget https://github.com/Velocidex/velociraptor/releases/download/v0.7.0/velociraptor-v0.7.0-4-linux-amd64 -P /data/velociraptor/",
                    "\n",
                    "sudo sleep 60",
                    "\n",
                    "sudo chmod +x /data/velociraptor/velociraptor-v0.7.0-4-linux-amd64",
                    "\n",
                    "sudo aws s3 cp s3://s3-velociraptor-prod/velociraptor.service /lib/systemd/system/",
                    "\n",
                    "sudo systemctl daemon-reload",
                    "\n",
                    "sudo systemctl enable --now velociraptor",
                    ]
                  ]
                  
        NetworkInterfaces:
            -
                AssociatePublicIpAddress: true
                DeviceIndex: "0"
                GroupSet:
                    - !Ref resourcePublicAccessSecurityGroup
                SubnetId: !Ref parameterSubnetID
        Tags:
            -
                Key: Name
                Value: !Sub '${AWS::StackName}-velociraptor'
                
        Volumes:
            -
                Device: "/dev/sdf"
                VolumeId: !Ref resourceNewVolume


resourceNewVolume:
    Type: AWS::EC2::Volume
    Properties:
        Size: !Ref parameterEBSSize
        AvailabilityZone: '<AVAILABILITY ZONE HERE>' # must match the availability zone of the instance
        Tags:
            - Key: MyTag
              Value: !Sub '${AWS::StackName}-velociraptor-ebs-volume'
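On the post installation script idea mentioned above, a minimal sketch of that alternative UserData is below. This is an assumption on my part: the bootstrap object name is hypothetical, and the script itself would carry the commands currently inlined in the template.

UserData:
    Fn::Base64: !Sub |
        #!/bin/bash -xe
        # Pull a post-install script from S3 (hypothetical object name) and run it
        aws s3 cp s3://s3-velociraptor-prod/bootstrap/velociraptor-bootstrap.sh /tmp/
        chmod +x /tmp/velociraptor-bootstrap.sh
        /tmp/velociraptor-bootstrap.sh ${AWS::StackName}

This keeps the template small and lets you revise build steps without updating the stack.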


TimeSketch

The following will explore the CloudFormation template used to deploy timesketch. I’ve broken the template into sections with explanations of each. Check out my GitHub page for the full template.

Parameters

The parameters for timesketch will seem familiar, with a few adjustments to InstanceType for sizing requirements and to the EBS volume size. The only addition here is the “parameterTimeSketchDomain” parameter; given we don’t have a configuration file that needs to match a domain name, I allowed for nomination of a subdomain name.

parameterSshKeyName:
    Description: Amazon EC2 Key Pair for TimeSketch instance
    Type: "AWS::EC2::KeyPair::KeyName"

parameterVpcID:
    Description: VPC ID to deploy TimeSketch instance to
    Type: "AWS::EC2::VPC::Id"

parameterSubnetID:
    Description: Subnet ID to deploy TimeSketch instance to
    Type: "AWS::EC2::Subnet::Id"

parameterInstanceType:
    Description: TimeSketch EC2 instance types
    Type: String
    Default: t2.large
    AllowedValues:
        - t2.large
        - t2.2xlarge

parameterEBSSize:
    Description: TimeSketch EC2 EBS Volume Size
    Type: Number
    Default: 500 # Change depending on size of environment
    
parameterTimeSketchDomain:
    Type: String
    Description: Enter the subdomain name for the stack you want mapped to the TimeSketch Instance
    Default: "ts"

IAM Policy

Again the IAM policy is largely the same as what was used for Velociraptor. When replicating this configuration careful consideration should be put towards the “Resource” field as this is where the target bucket is determined; this should be restricted to your specific timesketch bucket or folder. The below uses * as a wildcard match for demonstration purposes.

It should be noted that if you are following along with the Breaches Be Crazy post you will need AWS permissions for scripts on the timesketch EC2 instance to copy artifacts from S3. I’ve used the same key and secret previously set up in the Velociraptor stack; it’s highly recommended these are broken out into separate credentials (a sketch follows the policy code below).

resourceTimeSketchS3BucketsRole:
    Type: AWS::IAM::Role
    Properties:
        AssumeRolePolicyDocument:
            Version: '2012-10-17'
            Statement:
            - Effect: Allow
              Principal: 
                Service:
                - ec2.amazonaws.com
              Action:
                - sts:AssumeRole
        Path: "/"

resourceTimeSketchS3BucketsInstanceProfile:
    Type: AWS::IAM::InstanceProfile
    Properties:
        Path: "/"
        Roles:
        - Ref: resourceTimeSketchS3BucketsRole
        

resourceTimeSketchS3BucketsPolicy:
    Type: "AWS::IAM::Policy"
    Properties:
      PolicyName: TimeSketchS3BucketsPolicy
      PolicyDocument:
        Version: "2012-10-17"
        Statement:
          -
            Action: s3:GetObject
            Effect: Allow
            Resource: "*" #Important! Restrict to your bucket here.

          -
            Action: s3:ListBucket
            Effect: Allow
            Resource: "*" #Important! Restrict to your bucket here.
            
          -
            Action: s3:PutObject
            Effect: Allow
            Resource: "*" #Important! Restrict to your bucket here.               
      Roles:
            - !Ref resourceTimeSketchS3BucketsRole
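On the point of separate credentials, a hedged sketch mirroring the user, access key and stored secret pattern from the Velociraptor stack (the resource names here are my own):

resourceTSIAMUser:
    Type: AWS::IAM::User
    Properties:
        UserName: !Sub '${AWS::StackName}-timesketch'

resourceTSIAMUserCredentials:
    Type: AWS::IAM::AccessKey
    Properties:
        Status: Active
        UserName: !Ref resourceTSIAMUser

resourceTSCredentialsStored:
    Type: AWS::SecretsManager::Secret
    Properties:
        Name: !Sub /timesketch/credentials/${resourceTSIAMUser}
        SecretString: !Sub '{"ACCESS_KEY":"${resourceTSIAMUserCredentials}","SECRET_KEY":"${resourceTSIAMUserCredentials.SecretAccessKey}"}'

These credentials could then be attached to a timesketch-specific policy scoped to the timesketch bucket only.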

Security Groups

Security groups have been kept the same as Velociraptor’s for testing purposes, however I’d strongly recommend these be revised. A default installation of timesketch won’t have TLS enabled and will only serve over HTTP.

It is advisable that, at a minimum, external access is restricted to a specific public IP, or preferably only allowed via private IP (in my case to AWS Client VPN users). The same can be done for Velociraptor, with exceptions added for Let’s Encrypt and GitHub interaction.

Alternatively AWS Session Manager could be deployed for access.

resourcePublicAccessSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
        VpcId: !Ref parameterVpcID
        GroupDescription: Allows SSH, HTTP, and HTTPS access to TimeSketch instance
        GroupName: !Sub '${AWS::StackName}-timesketch-ssh-access'
        SecurityGroupIngress:
            -
                CidrIp: '0.0.0.0/0'
                Description: 'Replace to allow for temp management' # Change and access via Client VPN - left open for lab purposes
                FromPort: 22
                ToPort: 22
                IpProtocol: tcp        
            -
                CidrIp: '0.0.0.0/0'
                Description: 'HTTPS web GUI' #Only works if TLS has been setup
                FromPort: 443
                ToPort: 443
                IpProtocol: tcp    
            -
                CidrIp: '0.0.0.0/0'
                Description: 'HTTP web GUI' #Not recommended - set up TLS with Let's Encrypt instead
                FromPort: 80
                ToPort: 80
                IpProtocol: tcp                        
        Tags:
            -
                Key: Name
                Value: !Sub '${AWS::StackName}-timesketch-ssh-access'



EC2 Resource

The key differentiation between the below and the Velociraptor instance is the AMI type (Ubuntu Server 20.04, recommended by timesketch) and the UserData section, the commands we run on instance start.

The commands perform the following actions:

  • Initialise the EBS volume and mount to /data.

  • Update and install the required packages.

  • Download Docker prerequisites.

  • Download the timesketch deployment script from GitHub.

  • Adjust permissions on the timesketch script.

  • Execute the deployment script using echo to deal with a prompt during deployment.

  • Initialise timesketch.

Again, another way of doing this could be to create post installation scripts stored in a code repository or S3 bucket, called and executed during startup. Additionally, you could add configuration to deal with server restarts, both intended and unintended; I’ve neglected this as this is purely a lab deployment, but one example is sketched after the template below.

resourceTimeSketch:
    Type: AWS::EC2::Instance
    Properties:
        AvailabilityZone: 'ap-southeast-2a'
        PrivateDnsNameOptions:
            HostnameType: resource-name
        IamInstanceProfile: !Ref resourceTimeSketchS3BucketsInstanceProfile
        ImageId: ami-0df4b2961410d4cff #default Ubuntu AMI - update if required
        KeyName: !Sub '${parameterSshKeyName}'
        InstanceType: !Sub '${parameterInstanceType}'
        UserData:
            Fn::Base64:
                Fn::Join: [ "", [
                    "#!/bin/bash -xe",
                    "\n",
                    "sudo sleep 60",
                    "\n",
                    "sudo mkfs -t xfs /dev/xvdf",
                    "\n",
                    "sudo mkdir /data",
                    "\n",
                    "sudo mount /dev/xvdf /data",
                    "\n",
                    "sudo mkdir /data/timesketch",
                    "\n",
                    "cd /data/timesketch",
                    "\n",
                    "sudo apt-get update -y",
                    "\n",
                    "sudo apt-get install ca-certificates curl gnupg  -y",
                    "\n",                        
                    "sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg",
                    "\n",
                    "sudo chmod a+r /etc/apt/keyrings/docker.gpg",
                    "\n",
                    'sudo echo "deb [arch="$(dpkg --print-architecture)" signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu "$(. /etc/os-release && echo "$VERSION_CODENAME")" stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null',
                    "\n",
                    "sudo apt-get update -y",
                    "\n",
                    "sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin -y",
                    "\n",
                    "sudo sudo wget https://raw.githubusercontent.com/google/timesketch/master/contrib/deploy_timesketch.sh -P /data/timesketch/",
                    "\n",
                    "sudo chmod 755 /data/timesketch/deploy_timesketch.sh",
                    "\n",                        
                    'sudo echo "no" | sudo /data/timesketch/deploy_timesketch.sh',
                    "\n",
                    "cd /data/timesketch/timesketch",
                    "\n",
                    "sudo docker compose up -d",
                    ]
                  ]

        NetworkInterfaces:
            -
                AssociatePublicIpAddress: true # this can be removed if only deploying to internal users
                DeviceIndex: "0"
                GroupSet:
                    - !Ref resourcePublicAccessSecurityGroup
                SubnetId: !Ref parameterSubnetID
                
        Tags:
            -
                Key: Name
                Value: !Sub '${AWS::StackName}-timesketch'
                
        Volumes:
            -
                Device: "/dev/sdf"
                VolumeId: !Ref resourceNewVolume

resourceNewVolume:
    Type: AWS::EC2::Volume
    Properties:
        Size: !Ref parameterEBSSize
        AvailabilityZone: '<AVAILABILITY ZONE HERE>' # must match the instance AZ (ap-southeast-2a above)
        Tags:
            - Key: Name
              Value: !Sub '${AWS::StackName}-timesketch-ebs-volume'
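As one example of the restart handling mentioned earlier: the /data mount created in UserData will not survive a reboot on its own. A minimal, assumed addition to the UserData command list would persist it in fstab (nofail avoids a boot hang if the volume is slow to attach); Docker's restart policies can then bring the containers back up:

                    "sudo sh -c 'echo \"/dev/xvdf /data xfs defaults,nofail 0 2\" >> /etc/fstab'",
                    "\n",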

Artifact collection and processing

Now that we’ve got our infrastructure up and running, we can focus on the original goal: the acquisition and processing pipeline. The below sections explore each component of the pipeline and walk through setup and usage.

The pipeline used is from Breaches Be Crazy; check out the GitHub page for additional info. I’ve made slight modifications to the resources provided to account for changes I’ve made when deploying this with CloudFormation and in AWS.

When working through deployment of this pipeline I ended up leaving some parts requiring manual intervention. With a little bit of command line magic these could be automated as part of post installation scripts triggered through CloudFormation.

The pipeline consists of several components:

  • Velociraptor Server - Velociraptor S3 Artifact - Custom server monitoring artifact to move our KAPE triage artifact from Velociraptor to S3 (we will use a built-in one).

  • TimeSketch Server - S3 Watcher - Script running to watch the S3 bucket for zip files and download them for processing.

  • TimeSketch Server - Processor and Ingest - Script running to watch for zip files downloaded to the timesketch server, unzip them, process them using log2timeline.py then import into timesketch.

  • TimeSketch Server - Plaso Backup - Script to upload processed .plaso files (output from log2timeline.py) to S3 for backup.

The below assumes you are familiar with acquisition using Velociraptor and will rely on a basic KAPE collection as the initial data for processing. If you need some help getting KAPE collection working, check out this documentation from the Velociraptor project.

The following depicts the pipeline and its various transfer, processing and ingestion components.

Velociraptor S3 Artifact

Before we can actually run a KAPE collection in Velociraptor we need to set up a server monitoring artifact to help get our collection from Velociraptor into S3, where timesketch can fetch it. During original creation of the Breaches Be Crazy content a custom server artifact was created and can be found on their GitHub repo. This has since been formalised and now ships with the latest releases of Velociraptor.

The server monitoring artifact we want to set up is called Server.Utils.BackupS3. It takes several parameters including an AWS region, an access key and secret, and an S3 bucket name. We should already have an S3 bucket set up, and the access key and secret were created as part of the CloudFormation template and can be found in Secrets Manager; they only allow specific interactions with the specified S3 bucket.

In an effort to add to the automation pipeline, you can run the below commands to automatically enable the server monitoring artifact; once complete it will await KAPE collections and upload them to the nominated bucket as they are obtained.

I intended to run these commands passing the variables for key and secret directly in via CloudFormation, however I wasn’t able to get this working cleanly without running them through additional bash scripts with command line variables.

An idea to further build on this would be to run these commands using CloudFormation, directly inputting the bucket name, access key and secret as they are created using !Sub variables; a sketch of what that could look like follows the commands below.

velociraptor --config server.config.yaml config api_client --name api_account --role administrator api.config.yaml
sudo yum install pip -y
sudo pip install pyvelociraptor

sudo pyvelociraptor --config /data/velociraptor/api.config.yaml 'SELECT add_server_monitoring(artifact="Server.Utils.BackupS3", parameters=dict(ArtifactnameRegex="Windows.KapeFiles.Targets",Bucket="s3-velociraptor-prod",Region="<REGION HERE>",CredentialsKey="<KEY HERE>",CredentialsSecret="<SECRET HERE>")) from scope()'
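For completeness, a hedged and untested sketch of what the !Sub approach could look like, assuming api.config.yaml has already been generated as above and reusing the logical resource names from the Velociraptor stack. The shell quoting here is exactly where things got messy for me:

UserData:
    Fn::Base64: !Sub
        - |
            #!/bin/bash -xe
            pip install pyvelociraptor
            pyvelociraptor --config /data/velociraptor/api.config.yaml 'SELECT add_server_monitoring(artifact="Server.Utils.BackupS3", parameters=dict(ArtifactnameRegex="Windows.KapeFiles.Targets", Bucket="s3-velociraptor-prod", Region="${AWS::Region}", CredentialsKey="${AccessKey}", CredentialsSecret="${SecretKey}")) FROM scope()'
        - AccessKey: !Ref resourceVRIAMUserCredentials
          SecretKey: !GetAtt resourceVRIAMUserCredentials.SecretAccessKey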

You can verify the artifact is enabled through the server artifacts section within the Velociraptor GUI, and view the relevant logs in the same spot. As I found out during this stage, those logs were invaluable for troubleshooting errors stemming from IAM role permissions.

If all goes to plan you should get a file like the below in your S3 bucket; names (based on the Velociraptor flow ID) and times will differ. If you don’t end up getting a zip file uploaded to S3 after a KAPE collection, check out the server events and associated log files within the “Server Events” page.

Also check that your IAM roles, S3 bucket name and region are correct in the server monitoring artifact configured on Velociraptor.

S3 Watcher

Deployment of the watcher is as per the Breaches Be Crazy script watch-s3-to-timesketch.py, with slight modification as we have set up an EBS volume mounted at /data. The script also needs the bucket name changed to match where our KAPE triage artifact was uploaded. Once modified, the deploy.sh script can be run to deploy it as a service. Depending on where you set your mount point for the EBS volume, the directory will need to be modified to match.

The goal of this script is to watch the S3 bucket for new files (the upload of our KAPE collection). When a new file is uploaded it will be downloaded to /data/timesketch/upload, based on our modification.

Modify the following lines:

#Add your bucket name here
bucket_name = '<Bucket Name Here>'

#change the folder structure here to match the EBS volume (/data/timesketch)
s3_client.download_file(bucket_name, key.key, '/data/timesketch/upload/'+key.key)


Processor and Ingest

Crucial to the operation of this whole pipeline is the processing log2timeline.py performs and the ingest into timesketch. The watch-to-timesketch.sh script takes care of this, and with some minor modification we can get it working within our environment.

Depending on where you set your mount point for the EBS volume you will need to modify the script slightly, as it uses the PARENT_DATA_DIR directory (by default set to /opt/timesketch/upload) as the initial watch point for the .zip file coming from the S3 watch script.

When looking at the script you may also notice the log2timeline command targets “/usr/share/timesketch/upload” as the input and output folder; the same is done in the ingest commands. When you install timesketch it creates a folder called “upload” on the native file system in its install path, and Docker has access to this folder, mounting it at “/usr/share/timesketch/upload”.

This means anything you put in /data/timesketch/upload (in our case, due to the EBS volume being mounted at /data) will be accessible from the deployed Docker container.

Based on that info, all we have to change is the below line to match our EBS volume mount point:

#location updated to match ebs mount point
PARENT_DATA_DIR="/data/timesketch/upload"

An additional difference is in our acquisition and upload phase, where we use the default server artifact. In the script, the naming of the folder relies on a label derived from Velociraptor and used within the filename. For our filename “Host computer F.CLFIE2JEQE1DS.H 2023-11-23 10:21:06 +0000 UTC.zip” we will use the flow ID. Not ideal; if we want to change this we can modify the server artifact, creating our own custom one.

We will change the below line to cut on “ ” (space) and use the third array value.

#This line will be modified
LABEL=$(echo $SYSTEM|cut -d"_" -f 4)

#Modification will be as per below
LABEL=$(echo $SYSTEM|cut -d" " -f 3)

That should do it, we can now deploy this script as a service using deploy.sh.

Plaso Backup

The final script provided (watch-plaso-to-s3.sh) backs up the Plaso files to an S3 bucket. This is optional, however recommended in a production environment. Again, some script manipulation may be required depending on the name of your mount point.

#directory will need to be changed to match the EBS volume / mount point setup
PARENT_DATA_DIR="/opt/timesketch/upload/plaso_complete"

#S3 bucket will need to be populated with your own bucket name
BUCKET_NAME=""

timesketch Services

The final piece is the handy deploy script and service files Eric and Whitney have built out. This will install the scripts as services for persistence, drop them in the correct location for reference and create the folders required for all this to work. Once again we change the folder names to align with our mount point: find and replace all /opt with /data and we should be good to go.


With that our setup is complete. We are now able to run KAPE collections on endpoints using Velociraptor, and the auto ingest pipeline will bring them into timesketch for consumption. In a future post I’ll showcase the pipeline in action and the capabilities of timesketch.

That’s all for now. Happy Hunting!






References

https://medium.com/@bryantcabantac/deploying-velocriptor-in-aws-using-cloudformation-scaling-forensic-acquisition-7ab4678d7e2a

https://github.com/ReconInfoSec/velociraptor-to-timesketch

https://timesketch.org/

https://docs.velociraptor.app/

https://docs.aws.amazon.com/cloudformation/
