Where You Put Your AI Data Matters

The growing importance of data sovereignty

Many companies are adding a new dimension to their discussions of artificial intelligence: data sovereignty and governance. Data volume will always be a consideration; handling terabytes of data significantly affects the total cost of operations. But as thinking shifts from data volume to how AI can be applied, organizations are looking at risk and security, resiliency, and how data integrity influences the value of AI workloads developed specifically for their business.

As Senior Vice President of IT and Digitization, I help set and manage digital technology, automation, and business transformation strategies for CoreSite. I’ve been with the company since 2008, in an ideal position to witness how digital transformation and disruptive innovations like e-commerce, smartphones and AI have shaped customer needs and data center evolution.

Our customers recognize the leaps in efficiency, productivity, and revenue AI could bring. They are also mindful that the rush to AI seems reminiscent of the rush to the cloud. While they want to move fast, the lessons learned by going "all in" on the cloud (unpredictable costs, vendor lock-in, compliance complexities, security risks) point to a strategy based on incremental, continuous improvement, which can yield great rewards.

Time has proven that where you put your infrastructure matters. Burgeoning awareness of the details around the location and transport of AI data signals that where you put your AI data matters as well.

Aleks Krusko, Senior Vice President of Information Technology and Digitization, CoreSite

Getting Started With AI

Enterprises are eager to implement AI. The question is: Where to start? I suggest that chatbots are a viable initial use case for many companies. They offer an AI application that can elevate customer experience and employee productivity while leveraging enterprise-specific knowledge and assets, without necessarily interoperating with (and impacting) core business processes.

Chatbots rely on large language models (LLMs), which are trained and refined using natural language processing (NLP) and machine learning (ML) techniques. If you are not familiar with AI chatbots, it's important to know that an LLM does more than interpret words: NLP enables it to identify sentiment, such as anger or delight. Chatbots also continue training the model automatically as they interact with customers (that's the ML part), so they get smarter as they gain experience. That data helps the company learn more about customers' challenges with products, preferences for interaction, and possible product enhancements.
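The loop described above can be sketched in a few lines. This is a deliberately minimal illustration, not a production design: the keyword lists stand in for a trained NLP sentiment model, and the interaction log stands in for the data a real system would feed back into model fine-tuning (the ML part).

```python
import re

# Hypothetical keyword lists standing in for a trained sentiment classifier.
NEGATIVE = {"angry", "broken", "terrible", "refund", "frustrated"}
POSITIVE = {"great", "thanks", "love", "perfect", "helpful"}

def detect_sentiment(message: str) -> str:
    """Crude keyword-based stand-in for an NLP sentiment model."""
    words = set(re.findall(r"[a-z']+", message.lower()))
    if words & NEGATIVE:
        return "negative"
    if words & POSITIVE:
        return "positive"
    return "neutral"

# Interactions accumulate here; a real system would periodically use
# this log to retrain or fine-tune the model ("smarter with experience").
interaction_log = []

def handle_message(message: str) -> dict:
    """Classify one customer message and record it for later training."""
    record = {"message": message, "sentiment": detect_sentiment(message)}
    interaction_log.append(record)
    return record
```

The log is also the raw material for the business insight mentioned above: aggregating it reveals recurring product complaints and interaction preferences.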

AI chatbots are a double-edged sword: they give companies the ability to do amazing things, and they give nefarious actors the ability to corrupt those amazing things, also by leveraging AI. Enterprises developing chatbots need to be vigilant about the data fed into the LLM to protect intellectual property from being exposed. They also need to rigorously test and monitor chatbot software for vulnerabilities, so hackers can't eavesdrop on customer/enterprise conversations, intercept data, or inject viruses or malicious bots into the system.

The pluses of the double-edged sword? Chatbots can answer the large majority of routine questions. When a chatbot isn't up to the task, the company can deliver the kind of experience only a human can provide, using what was learned during the online interaction to transfer the customer to a representative with relevant knowledge.
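The handoff described above can be sketched as a simple routing rule: answer with the bot when it is confident, otherwise escalate with the conversation context attached. The topic names, skill mapping, and confidence threshold here are hypothetical placeholders, not any particular vendor's API.

```python
from dataclasses import dataclass, field

@dataclass
class Conversation:
    topic: str                 # detected subject of the conversation
    confidence: float          # bot's confidence it can answer (0.0-1.0)
    transcript: list = field(default_factory=list)

# Hypothetical mapping of topics to representatives with matching skills.
REPS_BY_SKILL = {"billing": "rep-billing-1", "networking": "rep-net-1"}

def route(convo: Conversation, threshold: float = 0.7) -> dict:
    """Answer with the bot if confident; otherwise escalate with context."""
    if convo.confidence >= threshold:
        return {"handler": "chatbot", "context": None}
    rep = REPS_BY_SKILL.get(convo.topic, "rep-general")
    # The transcript travels with the handoff, so the human representative
    # doesn't make the customer repeat themselves.
    return {"handler": rep, "context": convo.transcript}
```

The key design point is that the escalation carries the transcript: the representative picks up where the bot left off rather than starting over.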

Also, consider the customer service representative's point of view. Turnover in call centers is a well-known challenge. People naturally feel better about their role, and stay in it, when they solve difficult problems for other people. That feels good, compared with grinding through yet another routine request that a chatbot could handle.

Wrangling Data and Corralling Risk

Disruptive innovations initially create a "wild, wild west" technological environment; AI is no exception. OpenAI's ChatGPT was adopted at a record pace. Today, according to AI software developer Vention, more than 80 percent of businesses have embraced AI to some extent, viewing it as a core technology within their organizations. The same report shows that 76 percent of business leaders find implementing AI challenging due to a lack of data expertise, undefined KPIs, poor data quality, regulatory hurdles, or reluctance of employees to buy in.

I'll add data governance to the list, especially in light of the risks public generative AI tools can create. One example is employees uploading proprietary information or data into ChatGPT, unaware that those assets may then be used to train the public model. Many organizations, including CoreSite, now use private GenAI-powered tools to perform the same kinds of tasks while keeping information inside enterprise firewalls.

DORA, the Digital Operational Resilience Act, and the EU AI Act are examples of how the wild west can be reined in, both within a specific industry and on a broad scale. DORA is specific to financial services organizations, providing data and network security guidelines for financial transactions. The EU AI Act categorizes AI systems by the level of risk they pose and sets mandates for AI governance, risk, and compliance.

The Devil in the AI Data Details

We text, game, stream music and movies, post on social media, etc., without a second thought about the data generated or interconnection it takes. Awareness is rising, however, as AI continues to weave its way into the activities of our digital lives and digital businesses. The hype cycle is waning; in its place, stakeholders are turning attention to the elements of AI data critical to its value. Let’s look at a few of those.

  • Residency: Where data is stored is a long-standing concern rooted in the need for privacy and intellectual property protection. Residency is essential for compliance with laws such as the General Data Protection Regulation (GDPR), the European Union law that gives individuals rights over their personal data. Those obligations extend to enterprise AI systems, which can contain sensitive information trained into AI models.
  • Accuracy: You’ve heard about AI models hallucinating, delivering inferences that are obviously (or not obviously) inaccurate. Don’t blame the messenger; dependable recommendations or responses to AI queries hinge on models being trained with accurate data.
  • Availability: Access to data is a top priority. Note that availability and residency are not the same thing. Where the server physically sits, relative to where the data is generated and acted upon, determines latency. For many applications, low latency is, or will be, the deciding factor in performance and usefulness. Discussion of the edge has taken a backseat to AI; however, as the idea of "taking AI to the data, not data to AI" gains traction, I think the edge will again become part of the conversation.
  • Integrity: Part of accuracy, data needs to be complete and consistent to be usable. It’s also paramount to the quality and reliability of a model’s output (inferences).
  • Security/Encryption: Data at rest and data in motion need to be available only to trusted stakeholders. Encryption reduces the risk of breaches and potential for ransomware attacks, regardless of the state of the data.

How Data Centers Can Make a Difference

Data center providers, in general, can help with many of these requirements. I'll take a moment to address how CoreSite can help with a few of the bullets above.

CoreSite offers direct public cloud connections and can leverage those on-net onramps to facilitate ultra-low-latency data ingress and lower-cost data egress, with "gold standard" security. Data integrity is affected by the quality of network performance and enterprise-to-enterprise interconnections. When performance is sub-par, data can be lost or corrupted, which are significant problems for data accuracy. Furthermore, if bandwidth is inadequate, latency-sensitive AI applications simply won't perform.

Of course, there is, and will be, more to consider as AI continues to mature. It will be exciting to be a part of the process, especially because so many new services and business innovations are being developed based on these latest advances.

I say it will be exciting to be part of the process because CoreSite’s role in AI is that of enabler. I’m often amazed by what our customers are doing with AI, and the advantages for non-AI use cases created in its wake.

We factor all of the above requirements into the data centers and colocation services we offer, and now, with the AI data customers need to bring into their infrastructure, we are looking at ways to enable AI even better. I'm happy to say that our facilities are more than ready. We've been helping customers with high-density solutions for quite some time, which includes delivering the power and cooling needed (liquid cooling among them) as well as the network capabilities.

In my 16 years at the company, I've learned that it's wise to consistently evaluate the technologies we use. Although it can be difficult to admit, sometimes you need to take a small loss to realize a big gain, and sometimes you need to slow down to speed up. I urge you to keep that in mind as you look at the sea of AI data you need to cope with and envision its potential for your company.