What criteria do you think are absolutely essential for an AI model to qualify as open source?

3.6k views, 5 Comments

CEO in Services (non-Government), a year ago

I should be able to access the code and the training data sets, with full transparency.

1 Reply

no title, a year ago

or ... use the latest definitions: https://siliconangle.com/2024/10/28/osi-clarifies-makes-ai-systems-open-source-open-models-fall-short/

CISO/CPO & Adjunct Law Professor in Finance (non-banking), a year ago

Comment below is not legal advice, consult your lawyer.

I agree with James Harris on the criteria for open source. However, a related issue is the type of use permitted under the applicable license(s). Dozens of licensing models apply to open-source projects, from the commonly used Apache 2.0 to the less well-known Chicken Dance License v0.2 (CDL) created by Andrew Harris. The CDL allows a prospective user to do the chicken dance on social media instead of distributing the new source code created from the licensed code.
There is a common misconception that open source means free to use for any purpose. Some LLM licenses prohibit commercial use, and unlike the example above, not all license holders take a light-hearted approach to intellectual property rules.
Make sure you are aware of any restrictions or affirmative requirements (such as public redistribution) at the outset of the project to prevent costly legal issues later.

VP of IT, a year ago

When we refer to open source, the model’s architecture and source code should be accessible to anyone, under a genuine open-source license. Transparency regarding data sources, datasets, and comprehensive documentation is essential, with particular emphasis on data sources, as they reveal what the model has been trained on, potentially raising ethical and privacy concerns. Ideally, the model should be community-driven, allowing developers to contribute and enhance it collaboratively. A significant advantage of this open-source approach is its adherence to ethical and privacy standards, as open access to data and source code facilitates compliance and accountability.

Director of IT, a year ago

Our open source aiSSEMBLE solution lead at Booz Allen offers the following criteria:
1. Source code transparency: The source code should be publicly accessible, including associated dependencies.
2. Model accessibility: The model should be publicly accessible in a repository or model catalog.
3. Model weights accessibility: The model weights that govern how the model behaves should be viewable and available in a serialized format that enables transfer learning (fine-tuning).
4. Technical documentation: Documentation should be sufficient for a third party to install, deploy and execute inference on the model.
5. License requirements: The specific open-source license and terms of use should be clearly identified.
One additional consideration is reproducibility requirements: the requirements to reproduce the model from scratch should be clearly articulated, including:
a. Versioned dataset(s) used and their location
b. Hardware requirements for training
c. Hyperparameters and other information
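
For anyone who wants to track these criteria across candidate models, here is a rough sketch of the list above as a checklist in Python. The class name, field names, and the `assess` helper are all hypothetical and only mirror the five criteria plus the reproducibility items; this is not an official tool or a formal definition of open source.

```python
from dataclasses import dataclass

# Hypothetical record of what a model release makes available.
# Field names are illustrative; they mirror the criteria listed above.
@dataclass
class ModelRelease:
    source_code_public: bool = False          # 1. source code and dependencies
    model_public: bool = False                # 2. model in a repository or catalog
    weights_available: bool = False           # 3. serialized weights for fine-tuning
    docs_sufficient: bool = False             # 4. install, deploy, run inference
    license_identified: bool = False          # 5. license and terms of use
    datasets_versioned: bool = False          # a. versioned datasets and location
    hardware_documented: bool = False         # b. hardware requirements for training
    hyperparameters_documented: bool = False  # c. hyperparameters and other info

REQUIRED = (
    "source_code_public",
    "model_public",
    "weights_available",
    "docs_sufficient",
    "license_identified",
)
REPRODUCIBILITY = (
    "datasets_versioned",
    "hardware_documented",
    "hyperparameters_documented",
)

def missing(release: ModelRelease, names: tuple) -> list:
    """Return the names of the criteria that the release does not satisfy."""
    return [name for name in names if not getattr(release, name)]

def assess(release: ModelRelease) -> dict:
    """Summarize which criteria are unmet, keeping the two groups separate."""
    required_missing = missing(release, REQUIRED)
    return {
        "meets_required": not required_missing,
        "required_missing": required_missing,
        "reproducibility_missing": missing(release, REPRODUCIBILITY),
    }

if __name__ == "__main__":
    # Example: weights and code are published, but no license is identified.
    candidate = ModelRelease(
        source_code_public=True,
        model_public=True,
        weights_available=True,
        docs_sufficient=True,
    )
    print(assess(candidate))
```

Keeping the reproducibility items in a separate group reflects that they are described above as an additional consideration, not as one of the five core criteria.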
