SOGETI UK BLOG

I distinctly remember the first time someone, my mother to be exact, asked me what I wanted to do when I was all grown up. I also clearly remember the answer: I wanted to become a wizard.

My mother, of course, asked me if I meant a magician, but no, I was quite certain that I wanted to become a wizard. Of course, I also had a nagging suspicion that it wasn’t going to work out, given the scarcity of wizards inhabiting the world.

After giving it some thought, I settled on wanting to be an inventor. Coupled with my interest in electronics, I decided that that’s where my future would lie.

That was until I got my first PC, a Tulip with dual 5.25-inch floppy drives. I then saw that the world of technology is where the real magic lies.

Technology is magical

Admittedly, PCs were a bit simpler back then. The traces on the motherboard were thick enough that you could repair them yourself if they somehow got damaged. The games I played had a maximum of four colours on the screen at a time. The sound consisted mainly of blips and bloops from the internal PC speaker. And you could trash your hard drive by not issuing the “PARK” command before shutting down, something I admit to having learned the hard way.

They were simpler machines with simpler software, not unlike the current IoT wave of cheap devices with limited capabilities. A perfect blend between hardware and software.

But for me, they were absolutely magical. By issuing a simple command, you could make the machine obey and change reality. Nothing underlined that more than the adventure games of the time, which were driven completely by text commands, with King’s Quest 3 being the first of many I played. Incidentally, that’s how I learned to write English long before I enrolled in English language classes: me, the game and a dictionary on my lap to figure out what command to issue next.

Technology is the future

Even after working in the field of IT for some years, I have never lost that sense of wonderment. I’ve seen colleagues leave the field, disillusioned by projects with impossible deadlines and shoestring budgets, by ever-changing requirements, and by customers who seemingly change their minds the moment the code is done.

But for me that is where the magic happens. Without these obstacles, I doubt we’d ever manage to really create magic. You only need to look at what’s happening in the world of technology today: self-driving cars, usable AI on a device that fits in your pocket, and a connected world where things, both mechanical and biological, needn’t break down but can be repaired before they fail.

But that’s not the whole story; for me, technology is also about being human. Humans are an inherently communicative species; we just love to talk to one another, and now we have working holographic technology, universal translators and global communication networks. These connect the whole world, break down barriers, help the environment and generally make the world a better place to live in.

All this was once the stuff of science fiction and today we’ve taken the first steps to making it a reality. Who knows what tomorrow brings? I have no idea, but I can’t wait to see what it brings when it arrives.

Posted in: Behaviour Driven Development, communication, Developers, Digital, e-Commerce, High Tech, Human Behaviour, Innovation, Internet of Things, Open Innovation, Opinion, Technology Outlook, Transformation      

 

When I was young, circa 1970, my mother used to close the curtains in the evening. Our privacy was important to her; no need for neighbours to be able to look inside our home.

Things have changed.

In the 1980s, the discussion on privacy was shaped by the digitalization of information. Our control over our personal data decreased. Laws and regulations were made to protect privacy, and organizations were forced to be transparent about the way they used personal data. It was not as strong as a closed curtain, but OK, we could live with it.

The internet made things even more complex in the 1990s, and since then we seem to have been running behind the developments. Our behaviour on the internet was by definition not hidden. A relatively small group of people started to get seriously worried, but the majority, on the contrary, started to publish personal data on social media, just for fun. We not only opened the curtains but even started to shine lights on our private lives. The more people watched, the happier we were.
But still, if you did not want to be visible and transparent, you did not have to be. Discussions on privacy shifted to cookies and how they were used commercially.

The smartphone made it even more difficult to keep the curtain closed. Traceability of your whereabouts is almost inherent to a smartphone. Privacy was often discussed as a trade-off with security: “If you have nothing to hide, you should not worry”.

And last but not least, we are entering the world of the Internet of Things. Living in a connected house, driving a connected car or riding a connected bicycle, with wearables on our bodies, through a smart city to our connected office: there will be no more escaping. To close the curtain you would need to be some kind of outcast, which is practically impossible.

Some people conclude that privacy is dead. If we look at the privacy we had in the 1970s: yes, that is dead.

In his talk “Why privacy matters”, journalist Glenn Greenwald makes clear that we are on a scary path.

He describes the panopticon, an architectural design used mainly in prisons, which gives a central authority the ability to view any inmate at any time without the inmate knowing. The idea of the design is to ensure that inmates are aware that they may be being watched at any moment, so that their behaviour remains compliant. Modern society shares characteristics of the panopticon: people behave differently when they know they may be watched by others. A society in which people can be monitored at every moment is a society that breeds conformity, obedience and submission. It creates a prison in our minds.

The way forward is not an easy one but the discussion on privacy should go far beyond cookies. Instead it should be much more about the essence of what privacy actually is and how we can prevent our connected society from turning into a mental prison.

Posted in: Digital, Human Behaviour, Innovation, Internet of Things, IT Security, privacy, Security, Smart, Social media      

 

Big data / NoSQL Cassandra / SOLR / Natural Language Processing – Text Mining for Pre-screening of Cancer Clinical Trials.

Cancer clinical trials are research studies that test the relevance of a new medical treatment on cancer patients. They are key to medical progress, and their success depends essentially on the number of patients enrolled in them.

Pre-screening patients manually requires lengthy investigation and repeated matching against patients’ records within a limited time frame.

Given also the large amount of money spent on this phase, automating the eligibility prescreening process turns out to be a promising and beneficial solution for cancer treatment.

In essence, automating this process is an information retrieval task. Medical records, which originate mainly from surgical pathology laboratories, constitute a rich source of unstructured data. They are written in natural (human) language, which is complex and difficult for a machine to process. Dealing with this type of data requires a structuring phase that extracts useful information and gives the machine the knowledge it needs, in effect translating human language into a machine-readable form.

Text Mining and Natural Language Processing (NLP), combined, constitute a solid solution for representing the valuable information stored in medical records. Both deal with free text, and their main objective is to extract non-trivial knowledge from it. NLP encompasses everything from information retrieval to terminology extraction, and from text classification to spelling correction and sentiment analysis. NLP methods rely heavily on probability theory, statistics and machine learning. They also draw on linguistic concepts, grammatical structure and the lexicon.

Recently, cancer research has been benefiting from advances in Text Mining and using its methods to support clinical decisions. More precisely, automating the matching of patients to cancer clinical trials has been the subject of many studies and solutions dealing with information retrieval from medical records. Working with cancer data means covering hundreds of cancer diseases with a very large lexicon. Many medical terminologies have been constructed to group medical concepts and thus provide a unified lexicon for the medical field. These terminologies, mainly UMLS, SNOMED and CIMO, are a major component of Natural Language systems designed for the medical field. They serve as a link between patient data and the Text Mining system, enriching clinical records and supplying synonyms for medical concepts.
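To make the role of a terminology more concrete, here is a minimal Python sketch of mapping free-text terms to shared concepts. The term list and the concept identifiers (CONCEPT-001 and so on) are made up for illustration; a real system would load them from a terminology such as UMLS or SNOMED, and this is not the lookup we actually deployed.

```python
import re

# Hypothetical mini-terminology: surface term -> made-up concept identifier.
# The identifiers are placeholders, not real UMLS or SNOMED codes.
TERM_TO_CONCEPT = {
    "breast cancer": "CONCEPT-001",
    "breast carcinoma": "CONCEPT-001",
    "mammary carcinoma": "CONCEPT-001",
    "lung cancer": "CONCEPT-002",
    "pulmonary carcinoma": "CONCEPT-002",
}


def find_concepts(text):
    """Return the sorted list of concept IDs whose terms occur in the text."""
    # Lowercase and collapse runs of whitespace so multi-word terms match.
    normalized = re.sub(r"\s+", " ", text.lower())
    found = set()
    for term, concept in TERM_TO_CONCEPT.items():
        # Word boundaries avoid matching inside longer words.
        if re.search(r"\b" + re.escape(term) + r"\b", normalized):
            found.add(concept)
    return sorted(found)
```

Because several surface terms share one identifier, a record mentioning "mammary carcinoma" and a trial criterion mentioning "breast cancer" end up comparable through the same concept.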

OVERVIEW OF OUR ALGORITHM

To get a clear view of how Text Mining and NLP can help automate clinical trial prescreening, let’s look in depth at the most commonly used methods for processing the natural language stored in clinical data. First, recall that the objective is to extract medical concepts and semantic types from both the clinical trial criteria datasets and the patient data. NLP provides a semantic representation of natural language sentences in order to map them to their original meaning. It uses either rule-based algorithms or, for more complex language processing, the machine learning paradigm.

Most automated patient prescreening systems are rule-based. Rule-based systems are easy and fast to deploy, and are therefore often preferred. Such methods perform well on simple types of information, but for complex types of data Machine Learning (ML) algorithms, although a black box for clinicians, are more robust and perform well. Rule-based models are mainly used for medical text pre-processing: tokenization, sentence parsing, redundancy removal, etc. After the free text has been pre-processed, an assertion detection phase follows in order to detect negation. The NLP system also tries to detect medical terms using different medical terminologies.

The other approach is to use ML models for the same purpose, by analysing a set of documents or individual sentences that have been hand-annotated with the correct values to be learned. The main ML algorithms used for NLP include Naïve Bayes, Support Vector Machines and Random Forests. They take as input a large set of features derived from patients’ records and try to learn rules from the annotated examples. ML can also be used to learn from previously selected patients’ data, by detecting features that explain enrollment in previous clinical trials.
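The rule-based steps described above (sentence splitting, tokenization and negation detection) can be sketched in a few lines of Python. This is a simplified illustration, not our production code: the trigger list, the five-token window and the function names are assumptions chosen for the example, and real assertion detection uses much richer rules.

```python
import re

# Assumed, deliberately small trigger list; real systems use far larger ones.
NEGATION_TRIGGERS = ("no", "not", "without", "denies", "negative for")
WINDOW = 5  # assumed number of tokens before a term that are checked for triggers


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase the sentence and keep alphanumeric word tokens only."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def is_negated(sentence, term):
    """True if a negation trigger appears within WINDOW tokens before `term`."""
    tokens = tokenize(sentence)
    term_tokens = tokenize(term)
    n = len(term_tokens)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] != term_tokens:
            continue
        preceding = " ".join(tokens[max(0, i - WINDOW):i])
        # Pad with spaces so that triggers only match whole tokens.
        if any(f" {trigger} " in f" {preceding} " for trigger in NEGATION_TRIGGERS):
            return True
    return False
```

For example, is_negated("Patient denies chest pain.", "chest pain") returns True, while is_negated("Biopsy shows carcinoma.", "carcinoma") returns False, so only the second finding would be kept as an asserted concept.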

Once all useful information has been retrieved from the unstructured data and expanded with all possible medical hyponyms from medical ontologies, it serves as an information retrieval data source for matching patients against the inclusion and exclusion criteria of clinical trials. Given a cancer clinical trial and the patients encountered, the Text Mining system supplies clinicians with a restricted list of eligible patients, significantly reducing the time and effort spent on manual pre-screening.
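As a rough illustration of the matching step, the Python sketch below treats each patient and each trial as sets of already extracted concepts. The class names, fields and the all-inclusion, no-exclusion rule are simplifying assumptions for this example; real criteria also involve numeric thresholds, dates and partial matches.

```python
from dataclasses import dataclass, field


@dataclass
class Patient:
    patient_id: str
    # Concepts asserted (not negated) in the patient's records.
    concepts: set = field(default_factory=set)


@dataclass
class Trial:
    inclusion: set  # every one of these concepts must be present
    exclusion: set  # none of these concepts may be present


def prescreen(trial, patients):
    """Return IDs of patients meeting all inclusion and no exclusion criteria."""
    eligible = []
    for patient in patients:
        has_all_inclusion = trial.inclusion <= patient.concepts
        has_any_exclusion = bool(trial.exclusion & patient.concepts)
        if has_all_inclusion and not has_any_exclusion:
            eligible.append(patient.patient_id)
    return eligible
```

The returned list is only a shortlist: clinicians still confirm each proposed patient, which is the point of pre-screening rather than automatic enrollment.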

BIG DATA ARCHITECTURE

Because of the large volume of data to be managed, we selected and designed a big data architecture based on DataStax. Why DataStax? Because it supports Hadoop, Spark, Cassandra and SOLR, all ready to use. Deploying it through the Microsoft Azure portal, we had several nodes up and running in around one hour.

We imported all the data into a Cassandra database, then SOLR indexed it and we were able to perform some data exploration and search quickly.

We added synonyms from SNOMED and UMLS in order to use SOLR’s synonym search feature. With dedicated NLP developments in Python, we implemented Natural Language Processing features (negation, semantic improvements, medical term identification, stemming, etc.) to improve the performance of the prescreening process.
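SOLR reads equivalent synonyms from a plain text file in which each line lists interchangeable terms separated by commas. As a hedged sketch of how terminology groups could be turned into such lines, the Python helper below (its name and the sample terms are assumptions for illustration, not our actual export script) normalizes and de-duplicates each group.

```python
def build_synonym_lines(groups):
    """Turn groups of equivalent terms into comma-separated synonym lines."""
    lines = []
    for terms in groups:
        # Normalize case and spacing, drop duplicates, keep first-seen order.
        seen = []
        for term in terms:
            cleaned = " ".join(term.lower().split())
            if cleaned and cleaned not in seen:
                seen.append(cleaned)
        if len(seen) > 1:  # a single term has no synonyms to declare
            lines.append(", ".join(seen))
    return lines
```

Calling build_synonym_lines([["Breast cancer", "breast carcinoma"]]) yields the single line "breast cancer, breast carcinoma". Terms that themselves contain commas would need escaping, which this sketch does not handle.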

TEST PHASE IN PROGRESS

By the end of 2016, we will complete the test phase and add some improvements that take user feedback into account.

A new post will be published then, with the final conclusions and results.

CONCLUSION

Drawing on the available scientific articles, we were able to design a cancer clinical trial prescreening solution for French. Several products exist for English, but no solution is available for France or other French-speaking countries.

The business benefits offered by our solution are already obvious: by suggesting a list of patients to the clinical trials team in a few minutes, instead of after several days of manual screening, the solution lets the team focus on confirming the proposed results rather than screening tons of patient records and data.

Contributor: Bilal AZENNOUD, Data Scientist, SOGETI France

Posted in: Automation Testing, Big data, Biology, Data structure, Innovation, Quality Assurance, Requirements, Research, Socio-technical systems, Software Development, Testing and innovation      

 

We know that cloud computing is “the new normal” just like virtualization was in the past. And we also know that the adoption of cloud computing by your organization can come with a series of benefits including:

  1. Reduced IT costs: You can reduce both CAPEX and OPEX when moving to the cloud.
  2. Scalability: In this fast-changing world it is important to be able to scale your solutions up or down depending on the situation and your needs, without having to purchase or install hardware or upgrades yourself.
  3. Business continuity: When you store data in the cloud, you ensure it is backed up and protected, which in turn helps with your continuity plan because in the event of a crisis you’ll be able to minimize downtime and loss of productivity.
  4. Collaboration: Cloud services allow you to share files and communicate with employees and third parties in a timely manner in this highly globalized world.
  5. Flexibility: Cloud computing allows employees to be more flexible in their work practices because it’s simpler to access data from home or virtually any place with an internet connection.
  6. Automatic updates: When consuming SaaS you’ll be using the latest version of the product, avoiding the pain and cost associated with software or hardware upgrades.

But once you ask yourself “what can possibly go wrong?”, you open your eyes to “cloudy weather”, where you must plan for, identify, analyze, manage and control the risks associated with moving your data and operations to the cloud.

To help you with the identification process, here is a list of risks that your organization can face once you start or continue the transition to the cloud:

  1. Privacy agreement and service level agreement: You must understand the responsibilities of your cloud provider, as well as your own obligations. In some situations, it is your obligation to configure the service correctly in order to obtain the best SLA possible.
  2. Regulatory compliance: Remember that although your data is residing on a provider’s cloud, you are still accountable to your customers for any security and integrity issues that may affect your data and therefore you must know the standards and procedures your provider has in place to help you mitigate your risk.
  3. Location of data: Know the location of your data and which privacy and security laws will apply to it, because it’s possible that your organization’s rights may get marginalized.
  4. Data privacy and security: Once you host confidential data in the cloud you are transferring a considerable amount of your control over data security to the provider. Ask who has access to your sensitive data and what physical and logical controls the provider uses to protect your information.
  5. Data availability and business continuity: How are your organization and the provider prepared to deal with a possible loss of internet connectivity? Weigh your tolerance level for unavailability of your data and services against the uptime SLA.
  6. Data loss and recovery: In a disaster scenario, how is your provider going to recover your data and how long will it take? Be sure to know your cloud provider’s disaster recovery capabilities and if and how they have been tested.
  7. Record retention requirements: If your business is subject to record retention requirements, how well is the cloud provider prepared to suit your needs?
  8. Environmental security: Cloud computing data centers are environments with a huge concentration of computing power, data, and users, which in turn creates a greater attack surface for bots, malware, brute force attacks, etc. Ask: how well prepared is the provider to protect your assets through access controls, vulnerability assessment, and patch and configuration management controls?
  9. Provider lock-in: What is your exit strategy in case your provider can no longer meet your requirements? Can you move your data and operations to another provider’s cloud? Are there technical issues associated with such a change?
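To weigh your tolerance for unavailability against an uptime SLA (risk 5 above), it helps to translate the percentage into hours. The short Python sketch below does that arithmetic; the function name is just for illustration, and it assumes a 365-day year and that the SLA is measured over the whole period, which real contracts may define differently.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def allowed_downtime_hours(uptime_percent, period_hours=HOURS_PER_YEAR):
    """Maximum downtime in hours that still meets the stated uptime percentage."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return period_hours * (1 - uptime_percent / 100)
```

Under these assumptions, a 99.9% uptime SLA still allows roughly 8.76 hours of downtime per year, and a 99% SLA allows about 87.6 hours, which is more than three and a half days.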

Remember, we are talking about your data and your business here, and once you transition to the cloud you are still accountable and responsible for what happens to them. And yes, moving to the cloud comes with a series of benefits and rewards, provided the associated risks are identified and well managed.

References:
http://www.cio.com/article/2409109/cloud-computing/risk-management-in-cloud-computing.html
https://www.business.qld.gov.au/business/running/technology-for-business/cloud-computing-business/cloud-computing-risks

Posted in: Cloud, Data structure, Digital strategy, Innovation, privacy, Quality Assurance, Research, Security, Software Development, Technical Testing, Virtualisation      

 

In an environment where the winner in any market is most often also its digital master, the question of whether your company will turn into digital predator or prey finds its answer in your testing abilities.

Come to TestExpo™ on 12 October at the Emirates Stadium and listen to Andreas Sjostrom, an internationally awarded digital strategist and Vice President at Sogeti, as he links business trends to practical aspects of automated testing, omni-channel and cloud.

This conference will be co-located with Agile Expo, offering a great opportunity to network.

Get registered here: http://bit.ly/1UeUaU1

Posted in: Automation Testing, Digital strategy, Research, Test Expo, Testing and innovation      