- 2 min read

xxx

On this page
Introduction

Semester 7

I created intrusion detection system IDS using machine learning algorithms. It was easy I started with zero AI knowledge. Good project if you want to combine AI with cyber security.

Semester 8 - Graduation internship

This is more of an AI project than a cybersecurity project. I wanted to create a private chatbot like ChatGPT that can run fully locally. Choose this project if you want to switch from cybersecurity to AI, which I don't think is your plan.

The project was highly technical so I don't expect you to understand all of it. It took me 18 weeks to finish it so it's really difficult.

The research involved 3 research questions:

1- What is the best open-source large language model.

2- What is the hardware needed to run the model locally.

3- What is the best backend and frontend libraries.

Here are the answers:

1- The best open-source large language model is Llama 3.3 70b https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct

2- It can be run on 2 Nvidia RTX 4090 when quantized to 4-bits.

3- Frontend: Open Web UI (https://docs.openwebui.com/) Backend: vLLM(https://docs.vllm.ai/en/latest/)

Text from project plan:

Context

Secura is a leading and independent expert in digital security, established in 2000 in the Netherlands. Their customer markets range from government and healthcare to finance and industry. Secura offers technical services, such as vulnerability assessments, penetration testing and red teaming. Secura also provides certification for IoT and Industrial (OT/ICS) environments, as well as audits, advisory services and awareness training. Secura’s goal: raising your cyber resilience. Since 2021, Secura is a Bureau Veritas (BV) company, meaning that Bureau Veritas is a majority shareholder. Bureau Veritas is a publicly listed company (Euronext: BVI) that specializes in testing, inspection and certification. BV was founded in 1828, has over 75.000 employees and is active in 140 countries. Secura is the cornerstone of the cybersecurity strategy of Bureau Veritas. For more information, please visit: https://www.secura.com/

Identified Problems

- - - -

  • Volume and Complexity: Secura has a vast repository of penetration testing reports. Manually searching through these documents is time-consuming and inefficient.
  • Productivity Impact: Employees spend considerable time finding information, which impacts productivity and limits focus on strategic tasks.
  • Technical Nature of Reports: Penetration testing reports are highly technical and contain specific jargon, making accurate translation challenging.
  • Human Translation Limitations: Manual translation is time-consuming, prone to errors, and can compromise the technical integrity and confidentiality of reports.
  • Client Communication: Miscommunications due to translation errors can lead to misunderstandings and reduced client trust.

By addressing the identified problems with the development of the RAG chatbot model and the penetration testing report translation model, Secura aims to significantly enhance its operational efficiency and client satisfaction. Design Challenge Deploy a large language model (LLM) infrastructure (backend + frontend) that enables Secura analysts in penetration testing to efficiently query reports and translate confidential reports into multiple European languages with high accuracy, low latency, and GDPR compliance.

What would I choose if graduating again.

I would learn how to fine tune a large language model to help me at some cyber security tasks like writing rules for a firewall or analyzing private network data.