Using machine learning to solve your dark data nightmare

One important point he makes is that this is a world of what he calls “small data”. Big data is of the order of terabytes of information, not 500 or so contracts or NDAs. Small teams, Paoli argues, need small algorithms: their own machine-learning models. That is essential for them, because lowest-common-denominator approaches are not only unreliable but also a possible vector for information leaks. A model that is yours and yours alone can be secured, and an attacker can’t use it to infer your document structures.

If something like this is to be successful, it also needs to operate within several key constraints: it can’t need expensive consultants to get working, and it can’t be expensive to run. Paoli characterizes his potential audience as individuals and small teams, like the public defender who has too many documents and too many forms to fill in to manage their caseload effectively.

So why now, when we’ve tried to deliver some version of this idea so many times over the last few decades? Paoli sees this as the point where acceptance of the cloud makes it easy for businesses to pick up a new tool that uses cloud compute to deliver results faster and more accurately than on-premises software and hardware.

The Docugami team is certainly well suited to the task at hand, with an application development team that comes from Office and Windows (including many of the original creators of Microsoft’s form management tool InfoPath), and a pure science team that mixes XML and machine-learning skills, as well as human/machine-learning interfaces. It’s an interesting approach to working with documents, mixing natural language processing and evolutionary machine-learning skills with a deep enterprise history.

With a public beta still some time away, and much of the technical detail still being kept secret, it’s going to be interesting to watch what Paoli and his team come up with.

We live in a world of documents.

Soon this may be one a machine helped me make.
