
Head of Data Science, AI & ML at Apex Group


Description

About Apex 

Apex Group Ltd., established in Bermuda in 2003, is a global financial services provider. With over 80 offices in 70 countries worldwide and more than 12,000 employees, Apex Group delivers an expansive range of services to asset managers, financial institutions, private clients and family offices. The Group has continually improved and evolved its capabilities to offer a single-source solution through establishing the broadest range of services in the industry, including fund services, digital onboarding and bank accounts, depositary, custody, super ManCo services, corporate services including HR and Payroll, and a pioneering ESG Ratings and Advisory solution. Apex Group’s purpose is to be more than just a financial services provider; it is committed to driving positive change in three core areas: the Environment and Climate Change; Women’s Empowerment and Economic Independence; and Education and Social Mobility.

Life at Apex isn’t just about the work you do; it’s about embracing the culture and loving what you do. Every employee plays a part in making Apex who we are today, and the more we grow, the more important that becomes. Whatever your career path or specialism, Apex rewards loyal and dedicated employees. The international nature of our business and global network of offices means there are opportunities to broaden your life experiences, with both short-term and permanent relocation options.

 

Job Specification 

 

We are seeking an experienced and visionary Head of Data Science specializing in AI and ML to lead our dynamic team of data scientists and drive our data-driven initiatives. As the Head of Data Science, you will be responsible for developing and executing our AI and ML strategy, leading the development of advanced models and AI-driven products, and collaborating closely with cross-functional teams to drive innovation and business growth.

Responsibilities: 

 

  • AI & ML Strategy Development: Develop and execute a comprehensive AI and ML strategy aligned with organizational goals and objectives.
  • Team Management and Leadership: Lead and mentor teams providing guidance, coaching, and professional development opportunities to foster a high-performing culture.
  • Advanced Machine Learning Models: Drive the development and implementation of advanced models and algorithms to extract actionable insights from complex datasets.
  • Research & Development: Stay abreast of the latest advancements in AI and ML technologies and methodologies, and lead R&D efforts to continuously improve our capabilities.
  • Collaboration: Collaborate closely with cross-functional teams, including data engineering, product management, and business development, to integrate AI and ML solutions into our products and services.
  • Project Management: Oversee the end-to-end execution of data science projects, ensuring timely delivery, quality, and alignment with business objectives.
  • Performance Monitoring & Optimization: Establish key performance indicators (KPIs) and metrics to measure the effectiveness of AI and ML initiatives and drive continuous improvement through optimization and iteration.
  • Stakeholder Communication: Communicate complex technical concepts and insights to non-technical stakeholders in a clear and compelling manner and serve as a trusted advisor on data science matters.
  • Compliance & Ethical Considerations: Ensure compliance with relevant data privacy regulations and ethical guidelines and promote a culture of responsible AI and ML usage.

 

 

Qualifications and Skills: 

 

  • Master’s or Ph.D. degree in Computer Science, Statistics, Mathematics, or a related field.
  • Proven track record of leadership in data science, with at least 7 years of experience in leading AI and ML teams/projects.
  • Deep expertise in machine learning techniques, including supervised and unsupervised learning, deep learning, reinforcement learning, etc.
  • Proficiency in programming languages such as Python, R, or Java, and experience with libraries/frameworks such as TensorFlow, PyTorch, scikit-learn, etc.
  • Strong analytical and problem-solving skills, with the ability to translate business requirements into technical solutions.
  • Excellent communication and interpersonal skills, with the ability to collaborate effectively with diverse stakeholders.
  • Experience working in Financial Services and asset management is a plus.
  • Familiarity with cloud platforms (e.g., AWS, Azure, Google Cloud) and big data technologies (e.g., Hadoop, Spark) is desirable.
  • Demonstrated commitment to continuous learning and professional development.
  • Experience in implementing analytics solutions within multinational financial services organizations.
  • Familiarity with cloud-based analytics platforms and tools.
  • Previous leadership role within the financial services, technology, or consulting industry.

 

Join Apex Group as the Head of Data Science, AI & ML and lead the transformation of data into actionable insights and new products by leveraging advanced AI and ML tools and techniques, shaping the future of our global financial services. Drive strategic decisions, foster innovation, and propel our organization forward by harnessing the power of analytics on an international scale.

 

What you will get in return: 

A genuinely unique opportunity to be part of a large, expanding global business. Joining Apex Group will provide you with a platform for professional and personal success and an environment where you can truly make an impact.

Our people are our greatest asset, and we believe learning is central to developing talent, nurturing strong leaders and fostering a supportive company culture, all of which ultimately drives our success.

 





Data Analyst, Industrial Performance at La Coopération


La Coopération Agricole is the unified representative body of France’s agricultural cooperatives, which play an essential role in the French agricultural, agri-food, and agro-industrial economy. As a political voice and source of proposals to French and European public authorities, the media, and civil society, La Coopération Agricole’s mission is to promote the cooperative model by highlighting its economic contribution across the regions.

Created at the initiative of the agricultural cooperatives, La Coopération Agricole // Solutions+ uses unique solutions to help accelerate the transitions the cooperatives must undertake in the coming years: evolving professions and skills, climate, industrial performance, and artificial intelligence.

This mission takes shape through tailor-made, one-of-a-kind solutions, forged by sector expertise, co-built with partners, and always at the cutting edge of innovation. LCA // Solutions+’s industrial performance offering includes optimization solutions (OptiFlux, OptiVentil, OptiElec, OptiSéchage…), consulting and audit assignments (grain conservation, energy optimization, supply chain, long-term viability of facilities), and role-specific training. To support the development of its “Opti” range of solutions by integrating AI into them, La Coopération Agricole Solutions+ is recruiting a Data Analyst, Industrial Performance.

Responsibilities

The successful candidate will provide hands-on, operational support to agricultural cooperatives in deploying the “Opti” solutions, carrying out the following tasks:

• Collect and extract relevant data sources and translate them into statistical data,

• Process, use, and integrate data in collaboration with the subject-matter experts,

• Create dashboards and set up KPIs and performance reporting to give a coherent view of results,

• Set up processes/queries and automations,

• Produce business analyses and recommendations for the users of the solutions,

• Contribute to the evolution of the solutions by helping draft roadmaps and specifications.

As part of the LCA // Solutions+ “AI” plan, you will also be asked to participate in or lead various cross-functional projects. Reporting directly to the director, you will join a diverse team with a strong spirit of solidarity.

We maintain a good working atmosphere, characterized by strong commitment and high standards given the purpose of our mission. Frequent travel across France to support the cooperatives.

Profile

With a Master’s-level degree (Bac+5) specializing in data analysis, you are focused on both external and internal clients; you are rigorous, tenacious, patient, and dynamic, and you have a good ability to synthesize.

• You are autonomous and show initiative,

• You are proficient with information systems and software (SAP, Excel, Web Analytics, BI, SAS, VBA, Python, etc.),

• You have strong interpersonal skills and enjoy working in a team with many different stakeholders, while also knowing how to work independently,

• You show initiative and are rigorous, organized, responsive, and analytical. You have prior experience (at least 5 years) in project management and decision-support tools (OAD) in an agricultural setting.

Additional Information

• Permanent, full-time contract (CDI)

• Location: to be determined

• Up to 4 days of remote work per week

• Frequent travel throughout France

• Salary: depending on profile





AI Engineer

Writer is looking for an AI Engineer to join our expanding team of AI experts.

At Writer, we believe in using the power of AI to unlock the potential of the enterprise. With the help of our AI engineer, we can continue to build the most advanced language model available in the industry and revolutionize how companies interact with AI. We’re looking for a creative problem solver who has a deep understanding of NLP and ML technologies and who can help us create powerful and meaningful applications of AI. If you’re passionate about using AI to transform the enterprise, then we want to hear from you.

As an AI Engineer at Writer, you will play a pivotal role in developing and implementing state-of-the-art generative AI models and algorithms. Collaborating closely with our diverse and dynamic team of software developers, you will have the opportunity to design and deploy AI solutions that drive our innovative products.

You will report to our CTO.

🦸🏻‍♀️ Your responsibilities 

  • Conduct cutting-edge research, design, and develop generative AI models and algorithms to solve complex problems.
  • Collaborate closely with our talented data scientists and machine learning engineers to preprocess and analyze large datasets for training AI models.
  • Train and fine-tune generative AI models using various techniques, including deep learning, reinforcement learning, and evolutionary algorithms.
  • Evaluate and validate the performance of AI models through rigorous testing and experimentation.
  • Collaborate with our skilled software developers to seamlessly integrate AI models into production systems, ensuring scalability and efficiency.
  • Stay up-to-date with the latest advancements in AI and machine learning research, and proactively suggest improvements to enhance our generative AI capabilities.
  • Document research findings, methodologies, and implementation details to share knowledge with internal and external stakeholders.
  • Collaborate closely with cross-functional teams to understand business requirements and translate them into innovative AI solutions.

⭐️ Is this you? 

  • Bachelor’s degree in Computer Science, Engineering, Mathematics, or related field
  • 5+ years of professional experience in software engineering and AI/ML development, including:
    • Machine learning algorithms and techniques
    • Proficiency in programming languages such as Python and frameworks such as PyTorch
    • Professional experience with LLMs and large-scale models
    • Knowledge of AI protocols and standards
    • AI development platforms and tools
  • Strong analytical and problem-solving skills
  • Ability to communicate complex ideas and concepts effectively
  • Ability to work independently and collaboratively

Curious to learn more about who we are and how we operate? Visit us here

🍩 Benefits

We don’t spend frivolously, but we do take care of our own. Besides smart, sincere colleagues and a vibrant work environment, we are proud to offer:
  • Employer-covered medical plans, dental, vision, and life insurance
  • FSA
  • Competitive parental leave policy (parents actually work here!)
  • Generous PTO
  • Company stock options
  • 401k plan with employer matching
  • Flexible schedules

Writer is an equal opportunity employer and is committed to diversity. We don’t make hiring or employment decisions based on race, color, religion, creed, gender, national origin, age, disability, veteran status, marital status, pregnancy, sex, gender expression or identity, sexual orientation, citizenship, or any other basis protected by applicable local, state or federal law. Under the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.


Data Annotator/AI Data Trainer – Data Scientist (Contractor)

Who are we?

Our mission is to scale intelligence to serve humanity. We’re training and deploying frontier models for developers and enterprises who are building AI systems to power magical experiences like semantic search, RAG, and agents. We believe that our work is instrumental to the widespread adoption of AI.

We obsess over what we build. Each one of us is responsible for contributing to increasing the capabilities of our models and the value they drive for our customers. We like to work hard and move fast to do what’s best for our customers.

Cohere is a team of researchers, engineers, designers, and more, who are passionate about their craft. Each person is one of the best in the world at what they do. We believe that a diverse range of perspectives is a requirement for building great products.

Join us on our mission and shape the future!

Why this role?

At Cohere, we’re obsessed with language and technology: we believe we need great writers and developers and always will. We also believe that remarkable talent, enthusiasm, and creative thinking add up to great work. We’re looking for someone with superb Python and data science skills to join our team and help shape the future of language technology. The most successful candidate will be a quick learner who is excited to train our model by working on a wide variety of writing- and code-based prompts.

We are on a mission to build machines that understand the world and make them safely accessible to all. Data quality is foundational to this process. Machines (or Large Language Models to be exact) learn in similar ways to humans – by way of feedback.

Our AI Data Trainers ensure that all samples fed to our AI model are well-written, technically sound and useful to the end user. By creating content that data scientists would find useful, writing working Python, or covering real use cases, you will be an essential part of improving our Large Language Model’s performance for iterations to come, and will have a lasting impact on Cohere’s tech.

Please Note: This role may require occasional work on-site at our Soho office in London, UK. We are looking for candidates who are able to commit a minimum of 12-24 hours a week to this project.

As an AI Data Trainer, you will:

    • Spend the majority of your time writing or reading/proofreading code and natural language to create perfect samples to train our models.
    • Label, proofread, and improve machine-written and human-written code.
    • Raise the bar continually by writing new code that is of exceptional quality to solve a variety of tasks, with a particular focus on data analytics.
    • Adeptly vary the style and functionality of code examples.
    • Follow our style guide, and make recommendations on unique situations that fall outside of its scope.
    • Work with intense attention to detail while citing sources of information.

You might be a good fit if you:

    • Have 2+ years of industry experience working on real-world data science problems and pipelines, and excel in data analysis and visualization.
    • Are a meticulous coder with an eye for readability, with experience in Python and industry-standard data science packages (NumPy, pandas, Matplotlib, SQLite, or others).
    • Are comfortable writing SQL and working with SQL-based workflows.
    • Have good familiarity with file/data formats such as Markdown, JSON, XML, YAML, and HTML.
    • Are a thoughtful and thorough code reviewer. You’ve spent time rewriting, proofreading, and giving feedback on others’ code in a previous role. You’ve worked with a code style guide before and enjoyed it.

Other things you’ll need:

    • Located in the UK.
    • A fast, thorough reader with great comprehension skills.
    • A curiosity about ML, AI, or LLMs (bonus points if you have any experience in these).
    • Expert web-based research skills that you’ve used for your code before.
    • Ability to follow complex instructions, navigate ambiguity and work independently in a remote or hybrid environment.

Interview Process:

– 20-30 minute video interview

– Technical assessment

– 30 minute final interview

We value and celebrate diversity and strive to create an inclusive work environment for all. We welcome applicants of all kinds and are committed to providing an equal opportunity process. Cohere provides accessibility accommodations during the recruitment process. Should you require any accommodation, please let us know and we will work with you to meet your needs.

Our Perks:

🤝 An open and inclusive culture and work environment

🧑‍💻 Work with cutting-edge AI technology

🪴 A vibrant & central location

🥨 A great selection of office snacks

🏆 Performance-based incentives


Data Analyst


 

Florence, Florence, Italy

MAS Management Network is a management consulting firm and is looking for a “Process and data junior analyst” for our activities in Florence.

The ideal candidate is a recent graduate in management engineering who has already gained some initial experience in the manufacturing sector, ideally in Fashion/Luxury, and who has a strong inclination toward innovation. They are comfortable with quantitative analysis and with IT solutions for data processing.

They will also provide support on a strategic project to consolidate management data, and in managing analysis and executive reporting activities involving managers, business users, and specialist technical vendors engaged in delivering advanced solutions.

Responsibilities:

  • Process analysis
  • Support for planning and monitoring project activities
  • Data analysis
  • Data manipulation with IT tools
  • Drafting reports and documents for management
  • Managing the activities of external specialists

The profile:

  • Master’s degree in Management Engineering
  • Dynamic and hands-on, with excellent communication skills;
  • Analytical and time-management skills;
  • Ability to work independently;
  • Excellent command of IT tools
  • Knowledge of data management tools: databases, SQL, advanced Excel
  • Knowledge of quantitative analysis tools: R, Python, mathematical/statistical packages
  • Good knowledge of English.

The workplace is Florence.

WE WILL ONLY CONSIDER CANDIDATES ALREADY BASED IN THE AREA

Contract

Employment contract as a salaried employee.

Workplace: Florence
Willingness to travel within Italy is required.


AI Social Media Manager

AI your people will love. That’s our vision, and it contains multitudes 🙂


We’re filling a big need for generative AI built ground-up for the needs of enterprises and embraced by their teams. As generative AI became a board-level initiative almost overnight for most enterprises, our market matured dramatically and we benefit from that tailwind. Our product is highly differentiated, our roadmap is transformative, and our customers are happy and vocal.

📐 About this role

We’re seeking a talented and experienced Social media manager to join our amazing marketing team — someone who’s a master of their craft. Someone who keeps up with trends, but who leads with data, creativity, and strategic thinking. Someone who can speak the language of executive and technical audiences in a human-to-human way. Someone who sees generative AI as a key that’ll unlock their most creative ideas and help bring them to life. Someone who can make a post go viral faster than a sneeze in a crowded elevator (AI wrote that one).

Is this you? If so, read on.

As Writer’s first Social media manager, you’ll be responsible for executing social media strategies that drive engagement, brand awareness, and lead generation. You’ll play a crucial role in simplifying complex technical and business topics and effectively communicating our brand and value proposition to our target audience. Excellent copywriting and content creation skills, a deep understanding of social media platforms, and experience in fast-growing B2B companies are an absolute must for success in this role.

🦸🏻‍♀️ Your responsibilities

  • Collaborate with marketing leadership to develop and execute a comprehensive social media strategy that aligns with our brand identity, business objectives, and target audience
  • Create engaging and compelling social media content, including posts, short videos, infographics, and other multimedia assets
  • Simplify complex technical topics and communicate them in clear, concise short-form copy that resonates with our target audience of enterprise executives and technical leaders
  • Curate sharable content from industry experts that engages and informs our social media followers
  • Get employees involved in sharing and promoting content created by Writer and other experts in the field
  • Manage, maintain, and grow our social media channels, including but not limited to LinkedIn, X, Instagram, and YouTube
  • Monitor social media trends, industry news, and competitor activities to identify opportunities for content creation and engagement
  • Collaborate with cross-functional teams, including design and product, to ensure social media content aligns with overall marketing initiatives
  • Engage with our social media community, respond to comments and messages, and foster meaningful conversations with our audience
  • Leverage social media analytics and reporting tools to track and measure the performance of social media campaigns, providing insights and recommendations for optimization
  • Stay up-to-date with the latest social media best practices, algorithm changes, and emerging trends, and apply them to enhance our social media presence
  • Monitor and manage social media advertising campaigns, working closely with the paid marketing team to optimize targeting, messaging, and budget allocation

⭐️ Is this you?

  • Bachelor’s degree in marketing, communications, or a related field. Additional certifications in social media marketing are a plus
  • Proven experience (5+ years) as a Social media manager in a fast-growing B2B company, preferably in the technology or SaaS industry. Bonus points for agency experience
  • Excellent writing and editing skills, with the ability to simplify complex technical topics and communicate them effectively to a non-technical audience
  • Strong creative thinking and storytelling abilities, with a keen eye for visual aesthetics and the ability to create engaging multimedia content
  • Keen interest and enthusiasm in using generative AI to accelerate social media content production
  • Deep understanding of social media platforms, algorithms, and best practices, including LinkedIn, X, Instagram, and YouTube
  • Strong collaboration skills and the ability to work across teams to develop ideas for social media content
  • Analytical mindset with the ability to interpret social media data, derive actionable insights, and make data-driven decisions
  • Self-motivated and proactive, with the ability to work independently and manage multiple projects simultaneously in a fast-paced environment
  • Proficiency in using social media management and analytics tools, such as Hubspot, PostBeyond, Hootsuite, Buffer, Sprout Social, Google Analytics, and social media listening platforms
  • Proficiency in using project management software such as Asana, Clickup, and Notion
  • Proficiency in using visual content creation tools such as Canva, CapCut, and Figma
  • Knowledge of B2B marketing strategies, lead generation tactics, and demand generation principles is highly desirable

If you’re a creative and strategic thinker with excellent writing skills and a passion for simplifying and humanizing complex technical and business topics, we’d love to have you join our team. Help us build a strong social media presence, engage our target audience, and drive brand awareness and lead generation in our fast-growing generative AI company.

Curious to learn more about who we are and how we operate? Visit us here

🍩 Benefits

We don’t spend frivolously, but we do take care of our own. Besides smart, sincere colleagues and a vibrant work environment, we are proud to offer:
  • Employer-covered medical plans, dental, vision, and life insurance
  • FSA
  • Competitive parental leave policy (parents actually work here!)
  • Generous PTO
  • Company stock options
  • 401k plan with employer matching
  • Flexible schedules

Writer is an equal opportunity employer and is committed to diversity. We don’t make hiring or employment decisions based on race, color, religion, creed, gender, national origin, age, disability, veteran status, marital status, pregnancy, sex, gender expression or identity, sexual orientation, citizenship, or any other basis protected by applicable local, state or federal law. Under the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.


Chief Information Security Officer (CISO) at Stg Di Hub


Our roster has an opening with your name on it!

Bally Sports’ mission is to build a transformative, participatory sports platform, anchored by the most exclusive and relevant live professional games, which provides fans a year-round opportunity to engage with the content and communities they are most passionate about.

The Position:
We are seeking an experienced Chief Information Security Officer (CISO) who will play an integral part in defining the fundamental principles for the protection of Bally Sports’ information resources and the proper controls needed to ensure compliance with internal and external regulations, while supporting the business needs and upholding Bally Sports’ reputation with its employees and customers.

Reporting to the CIO, the CISO position requires an energetic, visionary leader who can shape the direction of the cyber program and directly lead program execution. The ideal candidate is a people and thought leader with significant operational and technology risk management experience in the entertainment and/or media industry. The CISO serves as the process owner of all assurance activities related to the confidentiality, integrity, and availability of customer, third-party vendor, employee, and business information in compliance with the organization’s information security policies. The CISO is responsible for establishing and maintaining Bally Sports’ security program to ensure that information assets and associated technology, applications, systems, infrastructure and processes are adequately protected in this rapidly growing and evolving, industry-leading ecosystem. The CISO will proactively work across the Bally Sports portfolio and partners to implement practices that meet agreed-upon policies and standards for information security. The CISO should also understand IT and must oversee a variety of cybersecurity and risk management activities related to IT to ensure the achievement of organizational outcomes where the process is dependent on technology.

In addition to traditional cybersecurity considerations, the CISO will also be responsible for implementing and running the program to support content security measures as well as other infrastructure security for Bally Sports. The CISO is responsible for identifying, evaluating, and reporting on legal and regulatory, IT, and cybersecurity risk to information assets, while driving and enabling bleeding-edge media creation, content distribution, and business objectives. For example, as Bally Sports focuses directly on engaging with its end consumers, the CISO will need to incorporate consumer privacy regulations into Bally Sports’ operational capabilities. The CISO should have a strong executive presence in order to effectively articulate the impact of security considerations to other senior Bally Sports stakeholders. The CISO must also be able to balance the demands of the organization, its constraints, and its personalities, while maintaining objectivity and a strong understanding that security is foundational for Bally Sports to deliver on its vision, goals and mission.

The Game Plan:
(What you’ll do)

  • Operationalize Bally Sports’ security strategy, identifying, tracking, and mitigating security risks encountered across the organization and third parties.
  • Own and be accountable for the resolution of security issues and threats as they arise. 
  • Collaborate and communicate effectively with Bally Sports executives.
  • Identify new security technologies to help Bally Sports identify, manage, and resolve security threats.
  • Oversee all security components (hardware, software, network, cloud, and facilities) of the Enterprise including broadcast, sports, gaming, and digital operations. 
  • Provide strategic vision for how security will be seamlessly integrated into components of media production, management, and distribution across linear, digital, and cloud platforms.
  • Work with Bally Sports’ managed security services provider to meet agreed-upon SLAs in addressing security risks.
  • Lead the overall security program, including schedule, budget, technical design, risk management, and existing operations.
  • Define and implement a zero-trust policy for Bally Sports. 
  • Establish and scale Identity Access Management (IAM) capabilities focused both within the organization and with its external customers. 
  • Institute security training and build awareness across the enterprise.

The Stats:
(What to bring)

  • Bachelor’s degree (Master’s preferred) and/or 10+ years working on a large-scale security team, preferably in a media/broadcast environment.
  • 10+ years of managing enterprise security risk.
  • 3+ years of experience in cloud-based security policies.
  • Excellent communication, organization, interpersonal and writing skills.
  • High level of knowledge associated with incident response activities in a distributed environment. 
  • Strong understanding of data analysis, computer applications, IS security standards, and network architecture. 
  • CISSP, CISM, or CISA certification preferred. 
  • Familiarity with security and media industry standards (ISO 17799, NIST 800 series, TPN, MPAA, etc.) and best practices. 
  • Knowledge of security auditing procedures.

 Desired Skills:

  • Multi-cloud tenancy experience.
  • Deep knowledge of media technologies. 
  • Experience in the broadcast engineering/technology space.

#Ballys

The Company is committed to fair and equitable compensation practices. This position is eligible for benefits that include participation in a retirement plan, life and disability insurance, health, dental and vision plans, flexible spending accounts, sick leave, vacation time, personal time, parental leave, and an employee stock purchase plan. Compensation for this role will be determined by various factors such as a candidate’s relevant work experience, skills, certifications, and geographic location.

Diamond Sports Group, L.L.C, an independently managed and unconsolidated subsidiary of Sinclair Broadcast Group, Inc., is proud to be an Equal Opportunity Employer.  

 

About us:

Diamond Sports Group LLC owns the Bally Sports Regional Sports Networks (RSNs), the nation’s leading provider of local sports. Its 17 owned-and-operated RSNs include Bally Sports Detroit, Bally Sports Florida, Bally Sports Great Lakes, Bally Sports Indiana, Bally Sports Kansas City, Bally Sports Midwest, Bally Sports New Orleans, Bally Sports North, Bally Sports Ohio, Bally Sports Oklahoma, Bally Sports SoCal, Bally Sports South, Bally Sports Southeast, Bally Sports Southwest, Bally Sports Sun, Bally Sports West, and Bally Sports Wisconsin. The Bally Sports RSNs serve as the TV home to close to half of all MLB, NHL and NBA teams based in the United States. Diamond Sports Group also has a joint venture in Marquee, the home of the Chicago Cubs, and a minority interest in the YES Network, the local destination for the New York Yankees and Brooklyn Nets. Diamond RSNs produce approximately 5,000 live local professional telecasts each year in addition to a wide variety of locally produced sports events and programs each year.

 

 



Source link


SiloFuse: Transforming Synthetic Data Generation in Distributed Systems with Enhanced Privacy, Efficiency, and Data Utility


In an era when data is as valuable as currency, many industries face the challenge of sharing and augmenting data across various entities without breaching privacy norms. Synthetic data generation allows organizations to circumvent privacy hurdles and unlock the potential for collaborative innovation. This is particularly relevant in distributed systems, where data is not centralized but scattered across multiple locations, each with its privacy and security protocols.

Researchers from TU Delft, BlueGen.ai, and the University of Neuchatel introduced SiloFuse in search of a method that can seamlessly generate synthetic data in a fragmented landscape. Unlike traditional techniques that struggle with distributed datasets, SiloFuse introduces a groundbreaking framework that synthesizes high-quality tabular data from siloed sources without compromising privacy. The method leverages a distributed latent tabular diffusion architecture, ingeniously combining autoencoders with a stacked training paradigm to navigate the complexities of cross-silo data synthesis.

SiloFuse employs a technique where autoencoders learn latent representations of each client’s data, effectively masking the true values. This ensures that sensitive data remains on-premise, thereby upholding privacy. A significant advantage of SiloFuse is its communication efficiency. The framework drastically reduces the need for frequent data exchanges between clients by utilizing stacked training, minimizing the communication overhead typically associated with distributed data processing. Experimental results testify to SiloFuse’s efficacy, showcasing its ability to outperform centralized synthesizers regarding data resemblance and utility by significant margins. For instance, SiloFuse achieved up to 43.8% higher resemblance scores and 29.8% better utility scores than traditional Generative Adversarial Networks (GANs) across various datasets.
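
To make the autoencoder idea concrete, here is a minimal PyTorch sketch of a tabular autoencoder that keeps raw rows inside the silo and exposes only latent representations. The layer sizes, dimensions, and training loop are illustrative assumptions, not SiloFuse’s actual architecture:

import torch
import torch.nn as nn

class TabularAutoencoder(nn.Module):
    """Minimal autoencoder: raw rows stay in the silo; only latents leave."""
    def __init__(self, n_features: int, latent_dim: int = 8):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_features, 32), nn.ReLU(),
            nn.Linear(32, latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 32), nn.ReLU(),
            nn.Linear(32, n_features),
        )

    def forward(self, x):
        z = self.encoder(x)        # latent representation of the rows
        return self.decoder(z), z  # reconstruction and latent

# Hypothetical local training on one client's data (all dimensions assumed)
model = TabularAutoencoder(n_features=16)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(256, 16)  # stand-in for a client's tabular batch
for _ in range(100):
    recon, z = model(x)
    loss = nn.functional.mse_loss(recon, x)
    opt.zero_grad()
    loss.backward()
    opt.step()

latents = model.encoder(x).detach()  # these, not x, would be shared

Training on latents like these, rather than on the raw rows, is what lets the synthesis step run across silos without the true values ever leaving each client.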

SiloFuse addresses the paramount concern of privacy in synthetic data generation. The framework’s architecture ensures that reconstructing original data from synthetic samples is practically impossible, offering robust privacy guarantees. Through extensive testing, including attacks designed to quantify privacy risks, SiloFuse demonstrated superior performance, reinforcing its position as a secure method for synthetic data generation in distributed settings.


In conclusion, SiloFuse addresses a critical challenge in synthetic data generation within distributed systems, presenting a groundbreaking solution that bridges the gap between data privacy and utility. By ingeniously integrating distributed latent tabular diffusion with autoencoders and a stacked training approach, SiloFuse surpasses traditional efficiency and data fidelity methods and sets a new standard for privacy preservation. The remarkable outcomes of its application, highlighted by significant improvements in resemblance and utility scores, alongside robust defenses against data reconstruction, underscore SiloFuse’s potential to redefine collaborative data analytics in privacy-sensitive environments.


Check out the paper for the full details; all credit for this research goes to the researchers of this project.










API Strategies for Effective Database Management and Integration


API (Application Programming Interface) strategies are pivotal in effective database management and integration. In today’s fast-paced digital landscape, where organizations operate across various databases and applications, seamlessly integrating these components is not just beneficial; it’s necessary for operational efficiency, insightful data analysis, and delivering superior customer experiences. APIs serve as the linchpins in this integration process, providing a structured and secure way for disparate systems to communicate and exchange data. Let’s delve into the essence of API strategies, comparing different approaches and elucidating their advantages and disadvantages through a detailed case study.

API Strategies for Database Management

APIs are the bridge that allows applications to interact with databases seamlessly. This interaction happens without the applications needing to know the intricacies of the database schema or the specific programming language in which the database operations are written. By abstracting these details, APIs simplify the development process, bolster security measures, and ensure that systems remain modular and easy to maintain. The strategic selection of an API can have far-reaching effects on integration ease, system performance, scalability, and the overall lifecycle of the application and database ecosystem.

Types of APIs in Database Management

The landscape of APIs in database management is diverse, with each type catering to specific needs and scenarios:

  1. RESTful APIs: The go-to for many web services, Representational State Transfer (REST) APIs utilize simple HTTP requests to create, read, update, and delete data. These APIs can work with various data formats, including text, JSON, and XML, making them highly versatile and easy to implement in multiple environments.
  2. SOAP APIs: The Simple Object Access Protocol (SOAP) APIs are known for their strict standards and high level of security, which makes them a favorite among enterprise-level applications where data security and integrity are paramount. Despite their robustness, they tend to have more overhead than RESTful APIs.
  3. GraphQL APIs: A relatively new API strategy, GraphQL offers an efficient query language for complex systems with interrelated data. It allows clients to request exactly the data they need, reducing bandwidth and improving response times (see the sketch after this list).
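
To make the contrast concrete, here is a minimal Python sketch, using the requests library, of fetching the same two fields over REST and over GraphQL. The endpoints, field names, and response shapes are hypothetical:

import requests

# REST: the endpoint decides the response shape; the client may over-fetch
product = requests.get("https://api.example.com/products/42").json()
name, stock = product["name"], product["stock"]  # rest of the payload goes unused

# GraphQL: the client names exactly the fields it needs, in one request
query = """
query {
  product(id: 42) {
    name
    stock
  }
}
"""
resp = requests.post("https://api.example.com/graphql", json={"query": query})
data = resp.json()["data"]["product"]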

A Comparative Look at API Strategies

Let’s compare these API strategies to highlight their distinct characteristics:

A Comparative Snapshot of API Strategies:

  • RESTful: simple HTTP-based operations; flexible data formats (text, JSON, XML); lightweight and easy to implement; fewer built-in security features.
  • SOAP: strict standards with strong security and transactional integrity (ACID); higher overhead and verbosity.
  • GraphQL: client-defined queries that fetch exactly the data needed; efficient for complex, interrelated data; steeper setup and learning curve.

Pros and Cons of the API Strategies

RESTful APIs:

  • Pros: Their lightweight nature and simplicity make RESTful APIs incredibly approachable for developers. They offer high flexibility, allowing quick adjustments and updates without significant overhead.
  • Cons: They may not offer the same security features as SOAP, making them less suitable for highly sensitive applications. Their efficiency can wane in scenarios requiring complex queries over multiple resources.

SOAP APIs:

  • Pros: They provide a highly standardized approach to API design, with robust security features and support for transactional integrity through ACID compliance. This makes them ideal for enterprise environments where consistency and security are non-negotiable.
  • Cons: The complexity and verbosity of SOAP can lead to slower performance, especially in web applications where quick responses are essential.

GraphQL APIs:

  • Pros: They dramatically reduce the need for multiple queries by allowing clients to specify exactly what data they need. This specificity can lead to significant performance improvements and greater flexibility in handling complex data relationships.
  • Cons: The complexity of setting up a GraphQL API can be a barrier for teams unfamiliar with its query language. Moreover, optimizing query performance requires a deeper understanding of the underlying systems.

Case Study: E-Commerce Integration Challenge

Consider an e-commerce company facing the challenge of integrating its online shopping platform with a legacy inventory management system. The integration is needed to ensure real-time synchronization of product information, stock levels, and order data, improving operational efficiency and customer satisfaction.

  • Solution: The company can opt for GraphQL APIs for this integration. The decision can be driven by the need for efficient, real-time data retrieval and updates across complex, interrelated datasets encompassing products, stocks, and orders.
  • Implementation Process:
    • A GraphQL server can be developed as an intermediary capable of interacting with the shopping platform’s database and the inventory system.
    • The implementation can leverage GraphQL’s powerful query capabilities to manage and synchronize data efficiently across systems, ensuring that product listings on the e-commerce site remain accurate and up-to-date (a sketch of such a query follows this list).
  • Outcomes:
    • Using GraphQL can reduce unnecessary data over-fetching and under-fetching, optimizing server load and response times.
    • Customers can enjoy a better shopping experience with real-time visibility into product availability.
    • Due to GraphQL’s flexible query language, the development team may find it easier to address complex data retrieval and manipulation requirements.
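
As an illustration, a query along these lines (the schema and field names are hypothetical, shown here as a Python string) could let the storefront pull a product’s listing data and live stock level from the intermediary in one round trip:

query = """
query ProductSync($id: ID!) {
  product(id: $id) {
    name
    price
    stock {
      available
      warehouse
    }
  }
}
"""

Compared with stitching together several REST calls, a single query like this keeps product and stock data consistent in one response.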

Conclusion

The strategic selection and implementation of APIs are fundamental to successful database management and integration. Whether opting for the simplicity and flexibility of RESTful APIs, the security and robustness of SOAP, or the efficiency and precision of GraphQL, the choice should align with the project’s specific needs around security, data complexity, and performance. The discussed comparative analysis and case study illustrate how a well-considered API strategy can facilitate seamless integration, enhance system interoperability, and drive digital transformation efforts forward.








How to Build a Plagiarism Detector Using Python [Part 1]


In this post, I will show you how to detect the percentage of plagiarism in a piece of text. A direct, practical solution I created and tested!

How to Build a Plagiarism Checker

The idea is very simple, acting as a perfect starting point to check plagiarism for any piece of text. I will explain the approach step by step with a practical example, so let’s start!

How Do Plagiarism Checkers Work?

Plagiarism checkers scan for matches between your text and existing texts and give a similarity percentage at the end.

Behind the scenes, most of these tools surf the web and scan your text for similarities against existing content found on the internet.

Exact matches are highlighted easily by comparison. However, more complex checkers can also identify non-exact matches, aka paraphrased text.

But before applying all that, these plagiarism tools start by splitting text into smaller chunks like phrases, sentences, and paragraphs; notice how I didn’t say “splitting into individual words.” That’s because words are independent, resulting in a less effective plagiarism test.


So, which chunking method should you choose?

Each approach has its pros and cons:

For example, if you choose to chunk by sentences, you’d get a more accurate result; however, the code will need more time to execute.

Moreover, this method wouldn’t be fair to apply if you’re examining someone (teachers using it on students, for example), because some generic sentences may already have been used by someone on the internet without the person having copied them.

The chunking-by-paragraphs method, by contrast, gives a less accurate result but takes less time to execute. It’s the go-to method when running a plagiarism detector on students’ work.

Here are the results I got when I tried both methods:

Difference between chunking by sentences and chunking by paragraphs

In the end, you choose the method based on your needs.

My Implementation

Let’s keep things simple with a real practical example! Here is what we need:

1- A function that takes care of the chunking of our text

2- A function that surfs the web and checks if this chunk exists

3- Add up all the results and get the percentage

Step 1: Text Chunking

Let’s make it dynamic!

import re
from typing import List

def chunk_text(text, chunk_by) -> List[str]:
    if chunk_by == "sentence":
        # Split on ., ? or ! unless the punctuation sits between digits (e.g. 3.14)
        sentences = re.split(r'(?<!\d)[.?!](?!\d)', text)
        sentences = [sentence.strip() for sentence in sentences if sentence.strip()]
        return sentences
    elif chunk_by == "paragraph":
        paragraphs = [paragraph.strip() for paragraph in text.split("\n") if paragraph.strip()]
        return paragraphs
    else:
        raise ValueError("Invalid chunk_by value. Choose 'sentence' or 'paragraph'.")

This function takes as input the text and your chosen chunking method, then if you choose:

  • By Sentence: I used a very straightforward method: I split whenever I find a ‘.’, ‘!’, or ‘?’ between sentences (the regex avoids splitting on decimal points like 3.14).
  • By Paragraph: I used a similar approach to the one above, which splits the input whenever there’s a new line between paragraphs. In Python, the new line is defined as \n.

This dynamic approach makes it easy to switch to whichever method you prefer. Plus, you can experiment yourself and see how the accuracy changes depending on the text and method used.

Step 2: Surf the Web

Now that we have split the text into chunks, we need to take each chunk, put it between double quotes like “[chunk]”, and search for whether it matches something on the internet.

Here’s an example of a unique chunk:

Add double quotes when searching on google to search for exact matches

As you can see, no results were found for “Learnwithhasan is the best website” although it’s a well-known fact 😂

💡 Tip 💡 

When you’re searching for an exact match of something, delimit it between double quotes. This way, the search engine you’re using knows that you’re looking for that exact phrase and not doing a normal search.

Back to our code:

# search_with_serpapi is provided by the SimplerLLM library (see below)
def search_chunk(chunk) -> bool:
    try:
        # Quote the chunk so the search engine looks for an exact match
        search_results = search_with_serpapi(f"\"{chunk}\"")
        found = len(search_results) > 0
        return found
    except Exception as e:
        print(f"An error occurred: {e}")
        return False

In this function, I used my library SimplerLLM, specifically a method that uses SerperAPI to search on Google from the code.

To access Google’s search engine from your code, you would normally need an API key and the corresponding code to call it. However, with SimplerLLM, the function is already built in, and you just call it using the “search_with_serpapi” method.

But, you need to generate your API key from their website, create a .env file, and add your key like this:

SERPER_API_KEY = "YOUR_API_KEY"

So, using the above function, each chunk is searched for on Google, and if a result exists, it returns True; otherwise, it returns False.

Step 3: Calculating the Result

Now it’s time to take these Trues and Falses and turn them into a percentage:

def calculate_plagiarism_score(text, chunk_by) -> float:
    chunks = chunk_text(text, chunk_by)
    total_chunks = len(chunks)
    plagiarised_chunks = 0
    for chunk in chunks:
        if search_chunk(chunk):
            plagiarised_chunks += 1
    
    plagiarism_score = (plagiarised_chunks / total_chunks) * 100 if total_chunks > 0 else 0
    return plagiarism_score

This function works by first calling the chunking method explained in Step 1, and then counting the total number of these chunks.

Using step 2, we determine whether each chunk is available on the web. If it returns True, it increases the count of plagiarized chunks.

After checking all chunks, the plagiarism score is calculated by dividing the number of plagiarized chunks by the total number of chunks and multiplying by 100 to get a percentage. Finally, it returns the plagiarism score as a decimal number (float).

Step 4: Running the Script

None of the above functions will produce anything until you give them an input and print the result.

import time

# MAIN SECTION
start_time = time.time()  # Record the start time
text = "YOUR_TEXT"  # The input text

chunk_by = "sentence"  # "sentence" or "paragraph"
plagiarism_score = calculate_plagiarism_score(text, chunk_by)

end_time = time.time()  # Record the end time
runtime = end_time - start_time  # Calculate the runtime

print(f"Plagiarism Score: {plagiarism_score}%")
print(f"Runtime: {runtime} seconds")  # Print the runtime

In this section of the code, you need to enter the text you want to run the plagiarism checker on, pick your preferred method of chunking, and print the results!

You’ll even get the time it took to generate the results (we’ll use it later🤫)

The Role of SimplerLLM

SimplerLLM is an open-source Python library designed to simplify interactions with large language models (LLMs). It offers a unified interface for different LLM providers and a suite of tools to enhance language model capabilities.

I created it to facilitate coding, and it did indeed save me a lot of time. But the main reason I’m using it in this script is that I’m planning on improving this code more and making it detect similarities, too, not just exact copies of the text. So, keep an eye out for the Semantic Plagiarism Checker Post!

Advanced Technique

Now, although the script we created is working properly, why don’t we improve it a little?

For example, when we find that the chunk is available on a webpage somewhere, we can fetch the URLs of these web pages. This simple tweak to the code would make the results of this script a lot more interesting, especially if you turned it into a tool with a nice UI.

Here’s what the new code will look like:

def search_chunk(chunk) -> List[str]:
    result = []  # [found (True/False), URL of the first match or "None"]
    try:
        search_results = search_with_serpapi(f"\"{chunk}\"")
        found = len(search_results) > 0
        if found:
            result.append(found)
            result.append(search_results[0].URL)
        else:
            result.append(found)
            result.append("None")
        return result
    except Exception as e:
        print(f"An error occurred: {e}")
        result.append(False)
        result.append("None")
        return result

def calculate_plagiarism_score(text, chunk_by) -> float:
    chunks = chunk_text(text, chunk_by)
    total_chunks = len(chunks)
    plagiarised_chunks = 0
    counter = 1
    for chunk in chunks:
        found, url = search_chunk(chunk)  # search once per chunk and reuse the result
        print(f"Chunk {counter} : {chunk} .... {found} .... {url}")
        counter += 1
        if found:
            plagiarised_chunks += 1

    plagiarism_score = (plagiarised_chunks / total_chunks) * 100 if total_chunks > 0 else 0
    return plagiarism_score

As you can see, I edited the “search_chunk” function so that it returns a list containing a True/False (whether it found an existing duplicate) and the link to the webpage that contains the same chunk. Plus, I added a print statement in the “calculate_plagiarism_score” function that prints each chunk, its number, its True/False result, and the URL of the matching webpage.

Here’s what the result will look like:

Plagiarism Checker response

Performance Optimization

A major limitation of the above script is that running it on a large amount of data, like multiple blog posts at a time, would be inefficient: every chunk has to be searched for on Google to see if there is existing content that matches it.

So, how can we fix this? There are two approaches we can try.

Approach 1:

The first is leaving the code logic as is but applying multi-threading so that multiple chunks are searched concurrently. Since most of the time is spent waiting on network responses, this makes the script much faster. The code will look something like this:

def calculate_plagiarism_score(text, chunk_by) -> float:
    """
    Calculates the plagiarism score of a given text by chunking it and
    checking each chunk for plagiarism in parallel.
    Note: this uses the original search_chunk that returns a bool, not the
    [bool, URL] version from the advanced technique above.
    """
    chunks = chunk_text(text, chunk_by)
    total_chunks = len(chunks)

    with ThreadPoolExecutor() as executor:
        # map applies search_chunk to all chunks; results come back in call order
        results = executor.map(search_chunk, chunks)

    plagiarised_chunks = sum(results)

    plagiarism_score = (plagiarised_chunks / total_chunks) * 100 if total_chunks > 0 else 0
    return plagiarism_score

The “calculate_plagiarism_score” function is the only one that gets updated, because all the work happens inside it, making it the function with the most runtime. Therefore, we apply ThreadPoolExecutor() to distribute the workload over multiple threads, which decreases the program’s runtime.

To use this built-in function, you need to import its corresponding module like this:

from concurrent.futures import ThreadPoolExecutor

Now let’s run the normal code and the one we just optimized and see the difference in speed:

Increase the speed of Plagiarism Checker by running it using threads

See the difference? The optimized one is roughly 7x faster 😲

The normal one took 29 seconds to run, while the optimized one (using threads) took only 4 seconds

Approach 2

The other approach is to decrease the number of search-engine calls by indexing search results on our local machine. Then, instead of searching the internet every time, we first check whether a matching chunk already exists in our indexed results.
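
As a minimal starting point, here is one way to sketch that caching idea: memoize search_chunk results in a local JSON file so a repeated chunk never triggers a second search-engine call (the cache file name and wrapper function below are just illustrative choices):

import json
import os

CACHE_FILE = "search_cache.json"  # hypothetical local index file

def load_cache() -> dict:
    if os.path.exists(CACHE_FILE):
        with open(CACHE_FILE, "r") as f:
            return json.load(f)
    return {}

def save_cache(cache: dict) -> None:
    with open(CACHE_FILE, "w") as f:
        json.dump(cache, f)

def search_chunk_cached(chunk) -> bool:
    cache = load_cache()
    if chunk in cache:           # cache hit: no search-engine call at all
        return cache[chunk]
    found = search_chunk(chunk)  # cache miss: fall back to the real search
    cache[chunk] = found
    save_cache(cache)
    return found

Swapping search_chunk_cached into calculate_plagiarism_score means repeated runs over overlapping content only pay for the new chunks.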

Now, addressing this approach fully could take something like 100 pages, so I’ll leave it for you to experiment with 😂

If you try it, make sure to share it with us, and if you have other, better approaches to improve the plagiarism detector, drop them in the comments below!




