Transforming Alteon Support with AI-Powered Support-File Analysis - Part 1


Introduction

In today's complex networking environments, troubleshooting and resolving issues quickly is critical to maintaining service availability. For Alteon Application Delivery Controllers, the Technical Support File (TSDMP) is a comprehensive log file that plays a pivotal role in identifying and diagnosing issues. However, analyzing these files presents unique challenges, even for experienced engineers.

This article explores some of the key challenges and design decisions behind building an enterprise-grade Retrieval-Augmented Generation (RAG) solution (Zhao et al. 2024) for Alteon TSDMP analysis, transforming a resource-intensive, expertise-heavy analysis process into a fully automated, 10-minute workflow that delivers expert-level insights.


About TSDMP Files

Alteon TSDMP files contain a wealth of device and network traffic information: system logs, configuration data, statistics, counters, debug logs, system states, and error messages. Over Alteon's extensive market history, these files have naturally evolved, with each successive product version adding diagnostic layers to support new features and emerging troubleshooting requirements.

While this growth in the scope and depth of TSDMP files reflects the product's maturity and the deep diagnostic capabilities that support teams rely upon, it also creates analytical challenges. In their current form, TSDMP files can range anywhere from 6,000 to 200,000 lines of technical data. The dense, comprehensive nature of these files, while invaluable for thorough diagnostics, requires expertise to navigate effectively, and even experienced engineers can find extracting relevant insights time-intensive. These characteristics, although common to mature enterprise platforms with extensive feature sets, can contribute to longer root-cause analysis, impacting Mean Time To Detection (MTTD) and Mean Time To Resolution (MTTR).

Why Traditional RAG Falls Short for TSDMP Analysis

RAG has become the go-to architecture for building AI systems that need to process and analyze domain-specific documents. At its core, RAG combines the power of large language models with external knowledge retrieval, enabling systems to provide accurate, contextual responses based on specific datasets rather than just pre-trained parametric knowledge.

Figure 1: Generic RAG architecture (Zhao et al. 2024, https://arxiv.org/pdf/2402.19473v1)
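To make the retrieve-augment-generate loop in Figure 1 concrete, here is a minimal, illustrative sketch in Python. The retrieval and generation backends are abstracted behind caller-supplied functions, since any specific embedding model, vector store, or LLM client would be an implementation choice rather than part of the generic architecture.

```python
from typing import Callable, List

def rag_answer(
    question: str,
    retrieve: Callable[[str, int], List[str]],  # (query, top_k) -> text chunks
    generate: Callable[[str], str],             # prompt -> completion
    top_k: int = 5,
) -> str:
    """Minimal retrieve-augment-generate loop, mirroring Figure 1."""
    # 1. Retrieve: fetch the chunks most similar to the question from
    #    the external knowledge base (vector index, BM25, etc.).
    chunks = retrieve(question, top_k)

    # 2. Augment: splice the retrieved chunks into the prompt so the
    #    model grounds its answer in retrieved content rather than
    #    relying purely on parametric knowledge.
    context = "\n\n".join(chunks)
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

    # 3. Generate: the LLM produces a contextual, grounded response.
    return generate(prompt)
```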

1. Domain Expertise Gap

However, the challenge of analyzing technical support files from specialized, proprietary networking and cybersecurity devices like Alteon is unique. These files contain specialized jargon, proprietary software terminology, hardware-specific subsystem logs, and diagnostic patterns that have typically never appeared in any public dataset. They represent a knowledge domain that diverges significantly from standard IT/networking troubleshooting data, or from data produced by ubiquitous vendors like Cisco, which an LLM may have been exposed to during training. Because an LLM relies on its parametric knowledge to contextualize, interpret, and reason about retrieved content, it tends to lack the underlying conceptual patterns needed to interpret domain-specific terminology, relationships, and patterns, and it performs inadequately when retrieved content diverges strongly from its training distribution.

2. Scale & Cost Challenge

TSDMP files vary dramatically in size, largely determined by the Alteon platform that produced them (VX, Standalone, vADC, or VA). In the field, we see anywhere from ≈ 6,000 to ≈ 200,000 lines per file, with a median around 140,000 lines. This amounts to a range of ~140K to ~1.3M input tokens, before even counting the additional prompt tokens needed for optimal in-context learning. Adopting best-in-class prompting techniques to elicit reasoning traces in LLM responses (Wei et al. 2022) can further increase the number of output tokens to roughly match the number of input tokens consumed.

Using current Claude 3.5 Sonnet rates as an example, this footprint translates to ≈ US $0.90 for the smallest files and up to ≈ US $30 for the largest, with the median landing near US $15.
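As a rough illustration of the arithmetic, the sketch below estimates per-file cost from token counts. The rates assume Claude 3.5 Sonnet's published pricing of roughly US $3 per million input tokens and US $15 per million output tokens; actual figures will vary with the model, prompt overhead, and the length of the reasoning traces.

```python
# Back-of-envelope cost estimator for streaming a raw TSDMP through an LLM.
# Rates are an assumption based on Claude 3.5 Sonnet pricing; adjust for
# the model actually used.

INPUT_RATE_PER_M = 3.0    # USD per million input tokens (assumed)
OUTPUT_RATE_PER_M = 15.0  # USD per million output tokens (assumed)

def tsdmp_cost_usd(input_tokens: int, output_tokens: int | None = None) -> float:
    # Reasoning-style prompting tends to produce output roughly the
    # size of the input, so default to a 1:1 output-to-input ratio.
    if output_tokens is None:
        output_tokens = input_tokens
    return (input_tokens * INPUT_RATE_PER_M
            + output_tokens * OUTPUT_RATE_PER_M) / 1_000_000

# A large ~1.3M-token TSDMP analyzed this way costs tens of dollars per run:
print(f"${tsdmp_cost_usd(1_300_000):.2f}")  # ≈ $23 at these rates, before prompt overhead
```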

In short, simply streaming raw TSDMP files through an LLM is both expensive and operationally unpredictable, a serious concern given that these files are artifacts support engineers rely on daily to diagnose and troubleshoot customers' issues. If the support queue ran dozens of these files every day, the spend would escalate quickly, making the approach economically impractical for regular use in a support environment. This is a problem we needed to solve.

3. Heterogeneous Composition: Why a “Just‑Pass‑the‑File” Strategy Fails

A TSDMP file is not a single, homogeneous document. Its contents are heterogeneous, comprising roughly 200 vastly different technical sections. To an LLM, each section requires its own guidance to steer the model towards the appropriate next-token probability distribution and enable effective in-context learning (Brown et al. 2020).

Because of this heterogeneity, the classic RAG tactic of “attach the whole file to the prompt and ask the model to diagnose” breaks down:

  • No single instruction works for all sections. For example, the instructional heuristics that help an LLM reason about health-check utilization debug prints are useless when it encounters Acceleration Engine debug dumps.
  • Context-window limits are quickly exceeded. A number of state-of-the-art models, such as Anthropic's Claude Sonnet family, top out at 200K tokens, while a large TSDMP bundle can easily surpass that limit on its own.
  • Signal-to-noise ratio collapses. Mixing unrelated sections dilutes the statistical cues the model needs to produce domain-specific insights, leading to vague or incorrect answers. A per-section alternative is sketched after this list.
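To illustrate the alternative, the sketch below splits a TSDMP into its constituent sections and pairs each with guidance tuned to that section, rather than attaching the whole file to one prompt. The header pattern and instruction table are hypothetical placeholders for illustration only, not the actual TSDMP format or the production routing layer described in the next part.

```python
import re

# Hypothetical section delimiter; the real TSDMP header format differs.
SECTION_HEADER = re.compile(r"^/\* (.+) \*/$", re.MULTILINE)

# Illustrative per-section guidance; the production instruction set is
# far larger and more specific.
SECTION_INSTRUCTIONS = {
    "Health Check Statistics": "Flag real servers with rising failure counters.",
    "Acceleration Engine Debug": "Correlate AE debug dumps with recent config changes.",
}
DEFAULT_INSTRUCTION = "Summarize any anomalies in this section."

def split_sections(tsdmp_text: str) -> dict[str, str]:
    # re.split with a capturing group yields [preamble, title1, body1, ...],
    # so pair each section title with the text that follows it.
    parts = SECTION_HEADER.split(tsdmp_text)
    return {parts[i]: parts[i + 1] for i in range(1, len(parts) - 1, 2)}

def build_prompts(tsdmp_text: str) -> list[str]:
    prompts = []
    for title, body in split_sections(tsdmp_text).items():
        guidance = SECTION_INSTRUCTIONS.get(title, DEFAULT_INSTRUCTION)
        # Each prompt carries only one section plus guidance tuned to it,
        # keeping the context small and the signal-to-noise ratio high.
        prompts.append(f"{guidance}\n\n=== {title} ===\n{body}")
    return prompts
```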

In the next part, we will delve into the implementation insights and the impact of our AI-powered solution for TSDMP analysis. We will explore the multi-stage preprocessing engine, the routing layer, the augmentation-and-generation schema, and the overall results of our approach. Stay tuned to learn how these components come together to transform TSDMP analysis into a highly efficient, automated workflow.


References

Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J. D., Dhariwal, P., ... & Amodei, D. (2020). Language models are few-shot learners. Advances in neural information processing systems, 33, 1877-1901.

Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., ... & Zhou, D. (2022). Chain-of-thought prompting elicits reasoning in large language models. Advances in neural information processing systems, 35, 24824-24837.

Yu, W., Zhang, H., Pan, X., Ma, K., Wang, H., & Yu, D. (2023). Chain-of-note: Enhancing robustness in retrieval-augmented language models. arXiv preprint arXiv:2311.09210.

Zhou, Y., Geng, X., Shen, T., Tao, C., Long, G., Lou, J. G., & Shen, J. (2023). Thread of thought unraveling chaotic contexts. arXiv preprint arXiv:2311.08734.

Zhao, P., Zhang, H., Yu, Q., Wang, Z., Geng, Y., Fu, F., ... & Cui, B. (2024). Retrieval-augmented generation for AI-generated content: A survey. arXiv preprint arXiv:2402.19473.

Arnab Chatterjee

Arnab Chatterjee is a Principal Tier 4 Engineer at Radware with over 15 years of experience in network engineering and application security. He has served as Radware’s top-level escalation point for critical issues, leading complex engineering initiatives and driving innovative solutions. Arnab holds a Master’s degree in Electrical Engineering from the New Jersey Institute of Technology and a Master’s degree in Philosophy from the University of Edinburgh.
