Tag: Multimodal

  • Multimodal image attachment is now available for Gemini in Android Studio



    Posted by Paris Hsu – Product Manager, Android Studio

    At every stage of the development lifecycle, Gemini in Android Studio has become your AI-powered companion, making it easier to build high-quality apps. We are excited to announce a significant expansion: Gemini in Android Studio now supports multimodal inputs, which lets you attach images directly to your prompts! This unlocks a wealth of new possibilities that improve team collaboration and UI development workflows.

    You can try out this new feature by downloading the latest Android Studio canary. We’ve outlined a few use cases to try, but we’d love to hear what you think as we work through bringing this feature into future stable releases. Check it out:

    https://www.youtube.com/watch?v=f_6mtRWJzuc

    Image attachment – a new dimension of interaction

    We first previewed Gemini’s multimodal capabilities at Google I/O 2024. This technology allows Gemini in Android Studio to understand simple wireframes, and transform them into working Jetpack Compose code.

    You’ll now find an image attachment icon in the Gemini chat window. Simply attach JPEG or PNG files to your prompts and watch Gemini understand and respond to visual information. We’ve observed that images with strong color contrasts yield the best results.

    1.1 New “Attach Image File” icon in chat window

    1.2 Example multimodal response in chat

    We encourage you to experiment with various prompts and images. Here are a few compelling use cases to get you started:

      • Rapid UI prototyping and iteration: Convert a simple wireframe or high-fidelity mock of your app’s UI into working code.
      • Diagram explanation and documentation: Gain deeper insights into complex architecture or data flow diagrams by having Gemini explain their components and relationships.
      • UI troubleshooting: Capture screenshots of UI bugs and ask Gemini for solutions.

    Rapid UI prototyping and iteration

    Gemini’s multimodal support lets you convert visual designs into functional UI code. Simply upload your image and use a clear prompt. It works whether you’re working from your own sketches or from a designer mockup.

    Here’s an example prompt: “For this image provided, write Android Jetpack Compose code to make a screen that’s as close to this image as possible. Make sure to include imports, use Material3, and document the code.” You can then append any additional instructions specific to the image.

    2. Example of generating Compose code from high-fidelity mock using Gemini in Android Studio (code output)

    For more complex UIs, refine your prompts to capture specific functionality. For instance, when converting a calculator mockup, adding “make the interactions and calculations work as you’d expect” results in a fully functional calculator:

    3. Example of generating Compose code from wireframe via Gemini in Android Studio (code output)

    Note: this feature provides an initial design scaffold. It’s a good “first draft”, but you will need to edit and adjust it. Common refinements include ensuring correct drawable imports and importing icons. Consider the generated code a highly efficient starting point that accelerates your UI development workflow.

    Diagram explanation and documentation

    With Gemini’s multimodal capabilities, you can also try uploading an image of your diagram and ask for explanations or documentation.

    Example prompt: Upload the Now in Android architecture diagram and say “Explain the components and data flow in this diagram” or “Write documentation about this diagram”.

    4. Example of asking Gemini to help document the NowInAndroid architecture diagram

    UI troubleshooting

    Leverage Gemini’s visual analysis to identify and resolve bugs quickly. Upload a screenshot of the problematic UI, and Gemini will analyze the image and suggest potential solutions. You can also include relevant code snippets for more precise assistance.

    In the example below, we used Compose UI Check and found that the button is stretched too wide on tablet screens, so we took a screenshot and asked Gemini for solutions – it was able to leverage the window size classes to provide the right fix.
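
    To make the idea behind that kind of fix concrete, here is a minimal sketch in plain Kotlin. It is not the code Gemini produced in the example: the names `WidthClass`, `widthClassFor`, and `buttonWidthDp` are hypothetical, and the 320dp cap is an arbitrary value chosen for illustration. The breakpoints assume the documented window size class widths (compact below 600dp, medium from 600dp, expanded from 840dp).

```kotlin
// Hypothetical names throughout; this is an illustration, not Gemini's output.
enum class WidthClass { COMPACT, MEDIUM, EXPANDED }

// Classify a window width (in dp) using the assumed breakpoints:
// compact < 600dp, medium 600dp to 839dp, expanded >= 840dp.
fun widthClassFor(windowWidthDp: Int): WidthClass = when {
    windowWidthDp < 600 -> WidthClass.COMPACT
    windowWidthDp < 840 -> WidthClass.MEDIUM
    else -> WidthClass.EXPANDED
}

// Returns the width the button should occupy, in dp.
// Compact windows: the button fills the available width.
// Wider windows: the button is capped so it no longer stretches edge to edge.
// The default 320dp cap is an arbitrary illustrative value.
fun buttonWidthDp(windowWidthDp: Int, maxWideButtonDp: Int = 320): Int =
    when (widthClassFor(windowWidthDp)) {
        WidthClass.COMPACT -> windowWidthDp
        WidthClass.MEDIUM, WidthClass.EXPANDED -> minOf(windowWidthDp, maxWideButtonDp)
    }
```

    In a real Compose screen, the same decision would typically be expressed through a width modifier on the button rather than a raw number, with the width class read from the window size class APIs.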

    5. Example of fixing UI bugs using Image Attachment (code output)

    Download Android Studio today

    Download the latest Android Studio canary today to try the new multimodal features!

    As always, Google is committed to the responsible use of AI. Android Studio won’t send any of your source code to servers without your consent. You can read more on Gemini in Android Studio’s commitment to privacy.

    We appreciate any feedback on things you like or features you would like to see. If you find a bug, please report the issue and also check out known issues. Remember to also follow us on X, Medium, or YouTube for more Android development updates!






  • Multimodal for Gemini in Android Studio, news for gaming devs, the latest devices at MWC, XR and more!



    Posted by Anirudh Dewani – Director, Android Developer Relations

    We just dropped our Winter episode of #TheAndroidShow, on YouTube and on developer.android.com, and this time we were in Barcelona to give you the latest from Mobile World Congress and across the Android Developer world. We unveiled a big update to Gemini in Android Studio (multimodal support, so you can translate images to code) and we shared some news for game developers ahead of GDC later this month. Plus we unpacked the latest Android hardware devices from our partners coming out of Mobile World Congress and recapped all of the latest in Android XR. Let’s dive in!

    https://www.youtube.com/watch?v=-Drt3YeIMuc

    Multimodal image-to-code, now available for Gemini in Android Studio

    At every stage of the development lifecycle, Gemini in Android Studio has become your AI-powered companion. Today, we took the wraps off a new feature: Gemini in Android Studio now supports multimodal image-to-code, which lets you attach images directly to your prompts! This unlocks a wealth of new possibilities that improve collaboration and design workflows. You can try out this new feature by downloading the latest canary – Android Studio Narwhal, and read more about multimodal image attachment – now available for Gemini in Android Studio.

    https://www.youtube.com/watch?v=f_6mtRWJzuc

    Building excellent games with better graphics and performance

    Ahead of next week’s Game Developers Conference (GDC), we announced new developer tools that will help improve gameplay across the Android ecosystem. We’re making Vulkan the official graphics API on Android, enabling you to build immersive visuals, and we’re enhancing the Android Dynamic Performance Framework (ADPF) to help you deliver longer, more stable gameplay sessions. Learn more about how we’re building excellent games with better graphics and performance.

    https://www.youtube.com/watch?v=SkkkwCEkO6I

    A deep dive into Android XR

    Since we unveiled Android XR in December, it’s been exciting to see developers preparing their apps for the next generation of Android XR devices. In the latest episode of #TheAndroidShow we dove into this new form factor and spoke with a developer who has already been building. Developing for this new platform leverages your existing Android development skills and familiar tools like Android Studio, Kotlin, and Jetpack libraries. The Android XR SDK Developer Preview is available now, complete with an emulator, so you can start experimenting and building XR experiences immediately! Visit developer.android.com/xr for more.

    https://www.youtube.com/watch?v=AkKjMtBYwDA

    New Android foldables and tablets, at Mobile World Congress

    Mobile World Congress is a big moment for Android, with partners from around the world showing off their latest devices. And if you’re already building adaptive apps, we wanted to share some of the cool new foldables and tablets that our partners released in Barcelona:

      • OPPO: OPPO launched their Find N5, a slim 8.93mm foldable with a large 8.12” screen – making it as compact or expansive as needed.
      • Xiaomi: Xiaomi debuted the Xiaomi Pad 7 series. Xiaomi Pad 7 provides a crystal-clear display and, with the productivity accessories, users get a desktop-like experience with the convenience of a tablet.
      • Lenovo: Lenovo showcased their Yoga Tab Plus, the latest powerful tablet from their lineup designed to empower creativity and productivity.

    These new devices are a great reason to build adaptive apps that scale across screen sizes and device types. Plus, Android 16 removes the ability for apps to restrict orientation and resizability at the platform level, so you’ll want to prepare. To help you get started, the Compose Material 3 adaptive library enables you to quickly and easily create layouts across all screen sizes while reducing the overall development cost.
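
    One way to prepare is to derive layout decisions from the current window size instead of from a fixed orientation or device type. The plain Kotlin sketch below illustrates that idea only. It does not use the adaptive library's API: `WindowSizeDp`, `paneCount`, and `rotated` are hypothetical names, and the two-pane policy at 840dp (the assumed lower bound of the expanded width class) is an assumption made for illustration.

```kotlin
// Hypothetical sketch: names and the pane policy are assumptions, not a library API.
data class WindowSizeDp(val widthDp: Int, val heightDp: Int)

// Decide how many panes to show from the current window width alone.
// 840dp is assumed here as the lower bound of the "expanded" width class.
fun paneCount(window: WindowSizeDp): Int = if (window.widthDp >= 840) 2 else 1

// Swapping width and height models a rotation of the same window.
fun rotated(window: WindowSizeDp): WindowSizeDp =
    WindowSizeDp(widthDp = window.heightDp, heightDp = window.widthDp)
```

    For example, a window of 800dp by 1280dp would get one pane under this policy, and the same window rotated to 1280dp by 800dp would get two, without the app ever asking which orientation or device it is running on.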

    https://www.youtube.com/watch?v=KqkUQpsQ2QA

    Watch the Winter episode of #TheAndroidShow

    That’s a wrap on this quarter’s episode of #TheAndroidShow. A special thanks to our co-hosts for the Winter episode, Simona Milanović and Alejandra Stamato! You can watch the full show on YouTube and on developer.android.com/events/show.

    Have an idea for our next episode of #TheAndroidShow? It’s your conversation with the broader community, and we’d love to hear your ideas for our next quarterly episode – you can let us know on X or LinkedIn.




